Using an MM-Principle to Achieve Fast Image Data Estimation from Large Image Data Sets

ABSTRACT

A majorize-minimize (MM) mathematical principle is applied to least squares regularization estimation problems to effect efficient processing of image data sets to provide good quality images. In a ground penetrating radar application, these approaches can reduce processing time and memory use by accounting for a symmetric nature of a given radar pulse, accounting for similar discrete time delays between transmission of a given radar pulse and reception of reflections from the given radar pulse, and accounting for a short duration of the given radar pulse.

RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional application No. 61/766,569, filed Feb. 19, 2013, U.S. Provisional application No. 61/923,410, filed Jan. 3, 2014, and U.S. Provisional application No. 61/940,354, filed Feb. 14, 2014, each of which is incorporated by reference in its entirety herein.

GOVERNMENT LICENSE RIGHTS

This invention was made with government support under contract number W911NF-1120039 awarded by US Army Research Laboratory and the Army Research Office. The government has certain rights in the invention.

FIELD OF THE INVENTION

This invention relates generally to image data processing and more specifically to applying the majorize-minimize mathematical principle to achieve fast image data estimation for large image data sets.

BACKGROUND

Half of the coalition forces casualties in the Iraq and Afghanistan wars are attributed to land mines and improvised explosive devices (IEDs). Consequently, a critical goal of the US Army is to develop robust and effective land-mines/IED detection systems that are deployable in combat environments. Accordingly, there is a desire to create robust algorithms for sub-surface imaging using ground penetrating radar (GPR) data and thus facilitate higher IED detection rates and lower false alarm probabilities.

Referring to the example schematic of FIG. 1, a GPR imaging system transmits signals from an above ground transmitter 102 into the ground of a scene-of-interest (SOI) 104. Signals that are reflected off of objects 106, 108, and 110 in the SOI 104 are received by one or more receivers 112 to generate images that convey relevant information about the objects 106, 108, and 110 (also known as scatterers) within the SOI 104. As a transmitted pulse propagates into a SOI 104, reflections occur whenever the pulse encounters changes in the dielectric constant (ε_(r)) of the material through which the pulse propagates. Such a transition occurs, for example, when the radar pulse moving through dirt encounters a metal object such as an IED. The strength of a reflection due to a patch of terrain can be quantified by its reflection coefficient, which is proportional to the overall change in dielectric constant within the patch.

In principle, GPR imaging is well-suited for detecting IEDs and land mines because these targets are expected to have much larger dielectric constants than their surrounding material, such as soil and rocks. It should be noted that for a high frequency transmission pulse (i.e., greater than 3 MHz), the backscattered signal of a target can be well approximated as the sum of the backscattered signals of individual elementary scatterers.

The phrase GPR image reconstruction refers to the process of sub-dividing a SOI into a grid of voxels (i.e., volume elements) and estimating the reflection coefficients of the voxels from radar-return data. Existing image formation techniques for GPR datasets include the delay-and-sum (DAS) or backprojection algorithm and the recursive side-lobe minimization (RSM) algorithm.

The DAS algorithm is probably the most commonly used image formation technique in radar applications because its implementation is straightforward. The DAS algorithm simply estimates the reflectance coefficient of a voxel by coherently adding up, across the receiver-aperture, all the backscatter contributions due to that specific voxel. Although the DAS algorithm is a fast and easy-to-implement method, it tends to produce images that suffer from large side-lobes and poor resolution. The identification of targets with relatively small radar cross section (RCS) is thus difficult from DAS images because targets with a large RCS produce large side-lobes that may obscure adjacent targets with a smaller RCS.

The RSM algorithm is an extension of the DAS algorithm that provides better noise and side-lobe reduction, but no improvement in image resolution. Moreover, results from the RSM algorithm are not always consistent. This may be attributed to the algorithm's use of randomly selected apertures or windows through which a measurement is taken. The requirement for a minimum threshold for probability detection and false alarms would make it difficult to use the RSM algorithm in practical applications.

Both the DAS and RSM algorithms fail to take advantage of valuable a-priori or known information about the scene-of-interest in a GPR context, namely sparsity. More specifically, because only a few scatterers are present in a typical scene-of-interest, in other words most of the backscatter data is zero, it is reasonable to expect better estimates of the reflectance coefficients when this a-priori sparsity assumption is incorporated into the image formation process.

Several linear regression techniques for sparse data set applications are known. Algorithms for sparse linear regression can be roughly divided into the following categories: “greedy” search heuristics, iterative re-weighted linear least squares algorithms, and linear inversion and deconvolution via l_(p)-regularized least-squares.

“Greedy” search heuristics such as projection pursuit, orthogonal matching pursuit (OMP), and the iterative deconvolution algorithm known as CLEAN comprise one category of algorithms for sparse linear regression. Although these algorithms have relatively low computational complexity, regularized least-squares methods have been found to perform better than greedy approaches for sparse reconstruction problems in many radar imaging problems. For instance, the known sparsity learning via iterative minimization (SLIM) algorithm incorporates a-priori sparsity information about the scene-of-interest and provides good results. However, its high computational cost and memory-size requirements may make it inapplicable in real-time settings.

Another known approach to sparse linear regression is the iterative re-weighted linear least-squares (IRLS), where the solution of the mathematical l₁-minimization problem is given by solving a sequence of re-weighted l₂-minimization problems. A conceptually similar approach is to compute the l₀-minimization by solving a sequence of re-weighted l₁-minimization problems.

Still another known approach to sparse linear regression is linear inversion and deconvolution via l_(p)-regularized least-squares (LS) methods. In these methods, the reflection coefficients are estimated using

$\begin{matrix} {\hat{x} = \underset{x}{\arg\min}\; \left\| y - Ax \right\|_{2}^{2} + \lambda \left\| x \right\|_{p}^{p},} & (1) \end{matrix}$

where λ is the regularization parameter. l₁-regularization (i.e., p=1) incorporates the sparsity assumptions by approximating the minimum l₀ problem, which is to find the most sparse vector that fits the data model. Directly solving the l₀-regularization problem is typically not even attempted because it is known to be non-deterministic polynomial-time hard (NP-hard), i.e., very processing intensive to solve. To date, l₁-regularization has been the recommended approach for sparse radar image reconstruction.
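As an illustration of how the choice of p in (1) affects sparsity, the following minimal sketch evaluates the objective for two candidate vectors that fit the same data equally well. The function name `lp_ls_objective`, the one-row matrix, and the candidate vectors are hypothetical and chosen only for this example.

```python
def lp_ls_objective(A, x, y, lam, p=1.0):
    """Return ||y - A x||_2^2 + lam * ||x||_p^p for a list-of-lists matrix A."""
    residual = [yk - sum(a * xl for a, xl in zip(row, x)) for row, yk in zip(A, y)]
    data_fit = sum(r * r for r in residual)
    penalty = sum(abs(xl) ** p for xl in x)
    return data_fit + lam * penalty

# Hypothetical toy problem: both candidates reproduce y exactly.
A = [[1.0, 1.0]]
y = [1.0]
x_sparse = [1.0, 0.0]   # one non-zero coefficient
x_dense = [0.5, 0.5]    # energy spread over both coefficients

costs = {p: (lp_ls_objective(A, x_sparse, y, 1.0, p),
             lp_ls_objective(A, x_dense, y, 1.0, p))
         for p in (2.0, 1.0, 0.5)}
```

In this toy case the p=2 penalty favors the dense vector, p=1 is indifferent between the two, and p=0.5 favors the sparse vector, which is consistent with l₁ being the convex penalty closest to the l₀ sparsity measure.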

So called l₁-LS algorithms incorporate the sparsity assumption, generally give acceptable results, and could be made reasonably fast via speed-up techniques or parallel/distributed implementations. LS-based estimation can, however, be ineffective and biased in the presence of outliers in the data. This is a particular disadvantage because, in practical settings, the presence of outliers in measurements is to be expected.

More specifically, the l₁-LS estimation method has been known for some time, wherein the concept has been popularized in the statistics and signal processing communities as the Least Absolute Shrinkage and Selection Operator (LASSO) and Basis Pursuit denoising, respectively. A number of iterative algorithms have been introduced for solving the l₁-LS estimation problem. Classical approaches use linear programming or interior-point methods. However, in many real-world and large scale problems, these traditional approaches suffer from high computational cost and lack of estimation accuracy. Heuristic greedy alternatives like Orthogonal Matching Pursuit and Least Angle Regression (LARS) have also been proposed. These algorithms are also likely to fail when applied to real-world, large-scale problems. Several other types of algorithms for providing l₁-LS estimates exist in the literature and others continue to be proposed.

SUMMARY

Generally speaking and pursuant to these various embodiments, the mathematical majorize-minimize principle is applied in various ways to process the image data to provide a more reliable image from the backscatter data using a reduced amount of memory and processing resources. In one approach, a processing device processes the initial data set by creating an estimated image value for each voxel in the image by iteratively deriving the estimated image value through application of a majorize-minimize principle to solve an l₁-regularized least-squares estimation problem associated with a mathematical model of image data from the initial data set. The application of the majorize-minimize principle to this approach can be further optimized for the GPR context by accounting for a symmetric nature of a given radar pulse, accounting for similar discrete time delays between transmission of a given radar pulse and reception of reflections from the given radar pulse, and accounting for a short duration of the given radar pulse. Application of these assumptions results in a relatively straight forward algorithm that can produce higher quality images while using reduced memory and processing resources.

In a second approach, a processing device processes the initial data set by creating an estimated image value for each voxel in the image by iteratively deriving the estimated image value through application of a majorize-minimize principle to solve the l₁-regularized least-absolute deviation (LAD) estimation problem associated with a mathematical model of image data from the initial data set. This so called l₁-LAD algorithm is more computationally expensive than some existing algorithms such as the standard DAS and LASSO algorithms; however, this approach can be optimized for the GPR context like the previous approach and provides robust handling of data outliers. With such optimizations, substantial gains in computational speed and memory usage are attainable via the developed fast implementation techniques. Furthermore, because the estimation of reflectance coefficients is decoupled, parallel and/or distributed implementations can also be developed to increase computational speed.

In a third approach, the majorize-minimize principle is applied to solve an l₁-regularized least-squares estimation problem for an image data set output by the popular DAS algorithm. This approach also is computationally efficient and only takes approximately 5% of the time required by the DAS algorithm. In studies using real data, the images created according to this approach are an improvement over the DAS images in that they have reduced clutter and improved sparsity without a loss of known scatterers. Additionally, these images were comparable to images created using the l₁-regularized least-squares approach described above even though this third approach takes only 1% of the computational time of the above-described l₁-regularized least-squares approach.

Accordingly, the above methods use the particularized data collected using transmitters and receivers to output images representing objects in a SOI. Devices, including various computer readable media, incorporating these methods then provide for display of image data using reduced processing and memory resources and at increased speed as processed according to these techniques.

These and other benefits may become clearer upon making a thorough review and study of the following detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

The above needs are at least partially met through provision of the methods and apparatuses for receiving and processing image data as described in the following detailed description, particularly when studied in conjunction with the drawings, wherein:

FIG. 1 comprises a schematic of operation of a prior art GPR system;

FIG. 2 comprises a schematic of an example radar system as configured in accordance with various embodiments of the invention;

FIG. 3 comprises a perspective view of an example implementation of a prior art ultra wide-band (UWB) synchronous impulse reconstruction (SIRE) radar system;

FIG. 4 comprises a schematic demonstrating a prior art approach to obtaining initial data from a SOI;

FIG. 5 comprises a graph demonstrating application of the mathematical majorize-minimize principle;

FIG. 6 comprises a flow diagram of an example algorithm applying the M-M principle to an L1-LS estimation as configured in accordance with various embodiments of the invention;

FIG. 7 comprises a graph displaying accuracies of various algorithms as applied to processing a given data set;

FIG. 8 comprises a displayed image of objects in a SOI using image data processed according to an M-M application to an L1-least squares method as configured in accordance with various embodiments of the invention;

FIG. 9 comprises a displayed image of the objects in the SOI of FIG. 8, in this case using image data processed according to a prior art DAS method;

FIG. 10 comprises a displayed image of the objects in the SOI of FIG. 8, in this case using image data processed according to a prior art RSM method;

FIG. 11 comprises a flow diagram of an example algorithm applying the M-M principle to an L1-LAD estimation as configured in accordance with various embodiments of the invention;

FIG. 12 comprises a graph displaying accuracies of various algorithms as applied to processing a given data set having an outlier;

FIG. 13 comprises a graph displaying accuracy of an example algorithm applying the M-M principle to an L1-LAD estimation as applied to processing the given data set having an outlier of FIG. 12;

FIG. 14 comprises a graph displaying accuracies of various algorithms as applied to processing a given data set without outliers;

FIG. 15 comprises a graph displaying a cost function for an example algorithm applying the M-M principle to an L1-LAD estimation as configured in accordance with various embodiments of the invention;

FIG. 16 comprises a displayed image of objects in a SOI using image data processed according to a prior art DAS algorithm;

FIG. 17 comprises a displayed image of the objects in the SOI of FIG. 16, in this case using image data processed according to an M-M application to an L1-LAD method as configured in accordance with various embodiments of the invention;

FIG. 18 comprises a flow diagram of an example algorithm applying the M-M principle to an L1-least squares estimation applied to an image data set from a DAS algorithm (L1-SIR) as configured in accordance with various embodiments of the invention;

FIG. 19 comprises a displayed image of objects in a SOI using image data processed according to a prior art DAS algorithm;

FIG. 20 comprises a displayed image of the objects in the SOI of FIG. 19, in this case using image data processed according to the L1-SIR algorithm as configured in accordance with various embodiments of the invention;

FIG. 21 comprises a displayed image of the objects in the SOI of FIG. 19, in this case using image data processed according to an example algorithm applying the M-M principle to an L1-LS estimation as configured in accordance with various embodiments of the invention.

Skilled artisans will appreciate that elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. For example, the dimensions and/or relative positioning of some of the elements in the figures may be exaggerated relative to other elements to help to improve understanding of various embodiments of the present invention. Also, common but well-understood elements that are useful or necessary in a commercially feasible embodiment are often not depicted in order to facilitate a less obstructed view of these various embodiments. It will further be appreciated that certain actions and/or steps may be described or depicted in a particular order of occurrence while those skilled in the art will understand that such specificity with respect to sequence is not actually required. It will also be understood that the terms and expressions used herein have the ordinary technical meaning as is accorded to such terms and expressions by persons skilled in the technical field as set forth above except where different specific meanings have otherwise been set forth herein.

DETAILED DESCRIPTION

Referring now to the drawings, and in particular to FIG. 2, an illustrative apparatus that is compatible with many of these teachings will now be presented. In a GPR application, a vehicle 202 includes a plurality of radar transmission devices 204 and 206 mounted on the vehicle 202. The radar transmission devices 204 and 206 are configured to transmit radar pulses 208 and 210 into a scene of interest 212. A plurality of J radar reception devices 214 are mounted on the vehicle 202 and configured to detect magnitude of signal reflections from the scene of interest 212 from the radar pulses 208 and 210. The vehicle 202 can be any structure able to carry the radar transmitters and receivers to investigate a scene of interest.

FIG. 3 illustrates one example implementation: a truck 302 mounted ultra wide-band (UWB) synchronous impulse reconstruction (SIRE) radar system developed by the US Army Research Laboratory (ARL) in Adelphi, Md. This system includes a left transmit antenna 304, a right transmit antenna 306, and 16 receive antennas 314. Other systems may have different numbers and arrangements of transmit and receive antennas. The transmit and receive antennas are mounted to a support structure 321, which supports these elements on the truck 302. The truck 302 drives through a scene of interest while the transmit antennas 304 and 306 alternately transmit radar pulses and the receiving antennas 314 receive reflections of the transmitted radar pulses from backscatter objects in the scene of interest.

Referring again to the example of FIG. 2, the positions along the vehicle's 202 path at which a radar pulse is transmitted are referred to as the transmit locations, I. As the vehicle 202 moves, the two transmit antennas 204 and 206 alternately send respective probing pulses 208 and 210 toward the SOI 212, and the radar-return profiles reflected from the SOI 212 are captured by multiple receive antennas 214 at each transmit location.

A location determination device 220 detects the location of the vehicle 202 at times of transmission of the radar pulse from the plurality of radar transmission devices 204 and 206 and reception of the signal reflections by the radar reception devices 214. In one example, the location determination device is a global positioning system (GPS) device as commonly known and used, although other position determination devices can be used. Accordingly, the positioning coordinates of the active transmit antenna 204 or 206 and all the receive antennas 214 are also logged. When using the UWB SIRE system of FIG. 3, there is typically a minimum range of detection of objects in the SOI from the vehicle 202 of about 8 meters, a maximum range of about 34 meters, and a cross-range of about 25 meters. The UWB SIRE system uses a FORD EXPLORER as the vehicle 202 such that the transmit antennas 204 and 206 have about a two meter separation between 16 receivers.

In one approach, the vehicle 202 includes a processing device 242 in communication with the location determining device 220, the transmit antennas 204 and 206, and the receivers 214 to coordinate their various operations and to store information related to their operations in a memory device 244. Optionally, a display 246 is included with the vehicle 202 to display an image related to the data received from the scanning of the scene of interest 212.

Due to the large size of the scene-of-interest, an initial data set is not generated by processing all voxels at once. Such an image would have cross-range resolution that varies from the near-range to the far-range voxels. The voxels in the near-range would have much larger resolution than those for the far-range ones. To create GPR images with consistent resolution across the scene-of-interest, we use the mosaicking approach discussed in L. Nguyen, “Signal and Image Processing Algorithms for the U.S. Army Research Laboratory Ultra-wideband (UWB) Synchronous Impulse Reconstruction (SIRE) Radar,” ARL Technical Report, ARL-TR-4784, April 2009, which is incorporated by reference and described with reference to FIG. 4. The steps taken to produce a complete image of the scene-of-interest (using the mosaicking approach) are described as follows. The image space associated with the scene-of-interest is divided into 32 sub-images of size 25×2 m². Each sub-image has 250 voxels in the cross-range direction and 100 voxels in the down-range direction. Thus, each sub-image has L=25000 voxels. The aperture (meaning the distance over which the vehicle travels while accumulating data for a SOI) is divided into 32 sub-apertures corresponding to separate, overlapping distances traveled by the vehicle, where adjacent sub-apertures (or vehicle travel windows) overlap by approximately 2 meters. Each sub-aperture has 43 transmit locations and is approximately of size 12×2 m². The radar-return and location positioning measurements associated with a sub-aperture are used to estimate the reflectance coefficients for the corresponding sub-image. The reconstructed sub-images are merged together to obtain the complete image of the scene-of-interest.
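The sub-image figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the uniform voxel spacing and the row-major voxel ordering used by the hypothetical helper `voxel_center` are assumptions, not details stated in the text.

```python
# Figures quoted in the text for one sub-image of the mosaic.
CROSS_RANGE_M = 25.0   # sub-image extent in the cross-range direction (meters)
DOWN_RANGE_M = 2.0     # sub-image extent in the down-range direction (meters)
N_CROSS = 250          # voxels per sub-image in cross-range
N_DOWN = 100           # voxels per sub-image in down-range
N_SUB_IMAGES = 32

voxels_per_sub_image = N_CROSS * N_DOWN          # the quantity L in the text
cross_spacing_m = CROSS_RANGE_M / N_CROSS        # assumes uniform spacing
down_spacing_m = DOWN_RANGE_M / N_DOWN           # assumes uniform spacing
total_voxels = N_SUB_IMAGES * voxels_per_sub_image

def voxel_center(l):
    """Center (cross, down) in meters of voxel l in 0..L-1, measured from a
    sub-image corner. Row-major ordering is an assumption of this sketch."""
    row, col = divmod(l, N_CROSS)
    return ((col + 0.5) * cross_spacing_m, (row + 0.5) * down_spacing_m)
```

Under these assumptions each voxel spans about 0.1 m in cross-range and 0.02 m in down-range, and each reconstruction problem has L=25000 unknowns, which is why the per-sub-image cost of the estimation algorithm matters.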

In another approach, referring again to FIG. 2, a separate computing device 260 may receive an initial data set from the vehicle based system to further process to create and optionally display images related to the SOI. The computing device 260 will typically include a processing device 262 in communication with a memory 264 to allow for processing the data according to any of the methods described herein. A display 266 may be included with or separate from the computing device 260 and controlled to display images resulting from the processing of the data received from the vehicle 202. Those skilled in the art will recognize and appreciate that such processing devices 242 and 262 can comprise a fixed-purpose hard-wired platform, including, for example, parallel processing devices, or can comprise a partially or wholly programmable platform. All of these architectural options are well known and understood in the art and require no further description here. Moreover, the memory devices 244 and 264 may be separate from or combined with the respective processing devices 242 and 262. Any memory device or arrangement capable of facilitating the processing described herein may be used.

With respect to the collection of data, consider a single scatterer, with spatial position p_(s), located at the center of a voxel (i.e., volume element) within the SOI. The spatial positions of the active transmit antenna and a receive antenna are denoted by p_(t) and p_(r), respectively. If the contributions of measurement noise are momentarily ignored, the relationship between the transmitted signal p(t) and the received signal g_(s)(t) can be modeled as

g _(s)(t)=α_(s) ·p(t−τ(p _(s) , p _(t) , p _(r)))·x _(s),   (2)

where x_(s) is the reflection coefficient of the voxel, τ(p_(s), p_(t), p_(r)) is the time it takes for the pulse to travel from the transmit antenna to the scatterer and back to the receive antenna, and α_(s) is the attenuation the pulse undergoes along the round-trip path.

The single scatterer model in (2) can be generalized to describe all the measurements captured by the SIRE GPR system. The SOI is subdivided into a rectangular grid of L voxels and the unknown reflection coefficient at the l^(th) voxel is denoted by x_(l). Extending the model in (2) to the SIRE system, the output of the j^(th) receive antenna at the i^(th) vehicle-stop is given by

$\begin{matrix} {{{s_{ij}(t)} = {{\sum\limits_{l = 1}^{L}{\alpha_{ijl} \cdot {p\left( {t - \tau_{ijl}} \right)} \cdot x_{l}}} + {w_{ij}(t)}}},{i = 1},2,\ldots \mspace{14mu},I,{j = 1},2,\ldots \mspace{14mu},{J.}} & (3) \end{matrix}$

In this equation, τ_(ijl) is the time it takes for the transmitted pulse to propagate from the active transmit antenna at the i^(th) transmit location to the l^(th) voxel and for the backscattered signal to return to the j^(th) receive antenna. The parameter τ_(ijl) is given by

$\begin{matrix} {{\tau_{ijl} = \frac{d_{il} + d_{ijl}}{c}},} & (4) \end{matrix}$

where d_(il) denotes the distance from the active transmit antenna at the i^(th) transmit location to the l^(th) voxel, d_(ijl) denotes the return distance from the l^(th) voxel to the j^(th) receive antenna when the truck is at the i^(th) transmit location, and c is the speed of light.

The notation α_(ijl) is the propagation loss that the transmitted pulse undergoes as it travels from the active transmit antenna at the i^(th) transmit location to the l^(th) voxel and back to the j^(th) receive antenna. The parameter α_(ijl) is given by

$\begin{matrix} {\alpha_{ijl} = {\frac{1}{d_{il} \cdot d_{ijl}}.}} & (5) \end{matrix}$

The notation w_(ij)(t) represents the noise contribution.
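Equations (4) and (5) depend only on antenna and voxel geometry, so both quantities can be computed together. The following is a minimal sketch; the function name `delay_and_attenuation`, the coordinate tuples, and the example positions are hypothetical, and the propagation speed is taken as the free-space speed of light as stated above.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s

def delay_and_attenuation(p_tx, p_rx, p_voxel, c=C_LIGHT):
    """Return (tau, alpha) per equations (4) and (5) for one tx/rx/voxel triple."""
    d_tx = math.dist(p_tx, p_voxel)   # d_il: transmit antenna to voxel
    d_rx = math.dist(p_voxel, p_rx)   # d_ijl: voxel back to receive antenna
    tau = (d_tx + d_rx) / c           # round-trip travel time, equation (4)
    alpha = 1.0 / (d_tx * d_rx)       # propagation loss, equation (5)
    return tau, alpha

# Hypothetical geometry: co-located antennas 5 m from the voxel.
tau, alpha = delay_and_attenuation((0.0, 0.0, 2.0), (0.0, 0.0, 2.0), (3.0, 4.0, 2.0))
```

For this geometry the round trip is 10 m, so the delay is about 33 ns and the attenuation is 1/25.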

The above mathematical model defined in (3) is continuous whereas, in practice, the SIRE GPR system stores only discrete, sampled versions of the return signals. Thus, we introduce the following discrete-time signals to adapt the above model to the real-world application: for i=1, 2, . . . , I, j=1, 2, . . . , J and n=0, 1, . . . , N−1,

y_(ij)[n] ≜ s_(ij)(nT_(s))   (6)

e_(ij)[n] ≜ w_(ij)(nT_(s))   (7)

where T_(s) is the sampling interval and N is the number of samples per radar return. From (6) and (7) we can write

$\begin{matrix} {{y_{ij}\lbrack n\rbrack} = {{\sum\limits_{l = 1}^{L}{\alpha_{ijl} \cdot {p\left( {{nT}_{s} - \tau_{ijl}} \right)} \cdot x_{l}}} + {{e_{ij}\lbrack n\rbrack}.}}} & (8) \end{matrix}$

The corresponding system of N equations can be written in matrix form as

y _(ij) =P _(ij) D _(ij) x+e _(ij)   (9)

where

$\begin{matrix} {{y_{ij}\overset{\Delta}{=}\begin{bmatrix} {y_{ij}\lbrack 0\rbrack} \\ {y_{ij}\lbrack 1\rbrack} \\ \vdots \\ {y_{ij}\left\lbrack {N - 1} \right\rbrack} \end{bmatrix}},{x\overset{\Delta}{=}\begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{L} \end{bmatrix}},{e_{ij}\overset{\Delta}{=}{\begin{bmatrix} {e_{ij}\lbrack 0\rbrack} \\ {e_{ij}\lbrack 1\rbrack} \\ \vdots \\ {e_{ij}\left\lbrack {N - 1} \right\rbrack} \end{bmatrix}.}}} & (10) \end{matrix}$

The matrix D_(ij) is an L×L diagonal matrix containing the attenuation coefficients that is given by

$\begin{matrix} {{D_{ij}\begin{bmatrix} \alpha_{{ij}\; 1} & 0 & \ldots & 0 \\ 0 & \alpha_{{ij}\; 2} & \ldots & 0 \\ \vdots & 0 & \ddots & \vdots \\ 0 & \ldots & 0 & \alpha_{ijL} \end{bmatrix}}.} & (11) \end{matrix}$

The matrix P_(ij) is an N×L matrix containing shifted versions of the transmitted pulse that is defined to be

$\begin{matrix} {{\mspace{734mu} (12)}{P_{ij}\overset{\Delta}{=}{\quad{\left\lbrack \begin{matrix} {p\left( {{0 \cdot T_{s}} - \tau_{{ij}\; 1}} \right)} & {p\left( {{0 \cdot T_{s}} - \tau_{{ij}\; 2}} \right)} & \ldots & {p\left( {{0 \cdot T_{s}} - \tau_{ijL}} \right)} \\ {p\left( {{1 \cdot T_{s}} - \tau_{{ij}\; 1}} \right)} & {p\left( {{1 \cdot T_{s}} - \tau_{{ij}\; 2}} \right)} & \ldots & {p\left( {{1 \cdot T_{s}} - \tau_{ijL}} \right)} \\ \vdots & \vdots & \ddots & \vdots \\ {p\left( {{\left( {N - 1} \right) \cdot T_{s}} - \tau_{{ij}\; 1}} \right)} & {p\left( {{\left( {N - 1} \right) \cdot T_{s}} - \tau_{{ij}\; 2}} \right)} & \ldots & {p\left( {{\left( {N - 1} \right) \cdot T_{s}} - \tau_{ijL}} \right)} \end{matrix} \right\rbrack .}}}} & \; \end{matrix}$

Next, the sampled data vectors {y_(ij)} (i.e., values for position and signal for transmission and reception of radar pulses) for all transmitter and receiver pairs are concatenated to obtain a K×1 (K=IJN) data vector y. Extending the model in (9) to account for all I·J transmitter and receiver pairs yields the desired model

y=Ax+e,   (13)

where the K×1 data vector y, K×L system matrix A, and K×1 Gaussian noise vector e are given by

$\begin{matrix} {{y\overset{\Delta}{=}\begin{bmatrix} y_{11} \\ y_{12} \\ \vdots \\ y_{1J} \\ y_{21} \\ \vdots \\ y_{2\; J} \\ \vdots \\ y_{I\; 1} \\ \vdots \\ y_{IJ} \end{bmatrix}},{A\overset{\Delta}{=}\begin{bmatrix} {P_{11}D_{11}} \\ {P_{12}D_{12}} \\ \vdots \\ {P_{1J}D_{1J}} \\ {P_{21}D_{21}} \\ \vdots \\ {P_{2J}D_{2J}} \\ \vdots \\ {P_{I\; 1}D_{I\; 1}} \\ \vdots \\ {P_{IJ}D_{IJ}} \end{bmatrix}},{e\overset{\Delta}{=}{\begin{bmatrix} e_{11} \\ e_{12} \\ \vdots \\ e_{1J} \\ e_{21} \\ \vdots \\ e_{2J} \\ \vdots \\ e_{I\; 1} \\ \vdots \\ e_{IJ} \end{bmatrix}.}}} & (14) \end{matrix}$

Because the SIRE GPR system uses a UWB radar, the duration of the transmitted pulse p(t) is relatively short, so most entries of the system matrix A are zero; that is, A is sparse.
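To illustrate why a short pulse makes A sparse, the sketch below builds one block P_(ij)D_(ij) of A and stores only its non-zero entries. The raised-cosine `pulse` with finite support, the delays, the attenuations, and the dictionary-per-row storage are all hypothetical choices made for this example rather than details taken from the text.

```python
import math

def pulse(t, half_width=2.5e-9):
    """Hypothetical finite-support symmetric pulse (raised cosine)."""
    if abs(t) >= half_width:
        return 0.0
    return math.cos(math.pi * t / (2.0 * half_width)) ** 2

def system_block(taus, alphas, n_samples, t_s):
    """Rows of P_ij D_ij, each stored as a dict {voxel index: non-zero value}."""
    rows = []
    for n in range(n_samples):
        row = {}
        for l, (tau, alpha) in enumerate(zip(taus, alphas)):
            value = alpha * pulse(n * t_s - tau)
            if value != 0.0:
                row[l] = value      # zeros are never stored
        rows.append(row)
    return rows

# Hypothetical delays and attenuations for L = 50 voxels, N = 200 samples.
taus = [(20 + 3 * l) * 1e-9 for l in range(50)]
alphas = [1.0 / (10.0 + l) for l in range(50)]
rows = system_block(taus, alphas, n_samples=200, t_s=1e-9)
nonzeros = sum(len(row) for row in rows)
fill_fraction = nonzeros / (200 * 50)
```

Each voxel contributes only to the handful of samples that fall inside the pulse support around its delay, so only a few percent of the block is non-zero in this toy configuration. Storing and multiplying only those entries is one way the short pulse duration can reduce memory use and computation.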

Given the pulse p(t), location (e.g., GPS) data, and observation-vector y, the objective is to estimate the unknown reflection coefficient vector x, which represents the material reflecting radar pulses in the SOI. Displaying this reflection coefficient data will correspond to displaying the objects in the SOI.

The Majorize-Minimize Principle

The MM (which stands for majorize-minimize in minimization problems, and minimize-majorize in maximization problems) principle is a prescription for constructing solutions to optimization problems. An MM algorithm minimizes an objective function by successively minimizing, at each iteration, a judiciously chosen objective function that is known as a majorizing function. Whenever a majorizing function is optimized, in principle, a step is taken toward reaching the minimizer of the original objective function. A brief summary of the MM principle is now given with reference to FIG. 5.

Let ƒ be a function to be minimized over some domain D ⊆ ℝ^(L), i.e., the function's minimum value is to be found within the given domain. A real-valued function g with domain D×D is said to majorize ƒ if

g(x, y)≧ƒ(x) for all x, y ∈ D   (15)

g(x, x)=ƒ(x) for all x ∈ D.   (16)

Suppose the majorizing function g is easier to minimize than the original objective function ƒ. Then, the MM algorithm for minimizing ƒ is given by

$\begin{matrix} {{x^{({m + 1})} = {\underset{x \in D}{\arg \; \min}{g\left( {x,x^{(m)}} \right)}}},} & (17) \end{matrix}$

where x^((m)) is the current estimate for the minimizer of f. The algorithm defined by (17), which is illustrated in FIG. 5, is guaranteed to monotonically decrease the objective function ƒ with increasing iteration. In other words, a further minimal or smaller value for the function ƒ is obtained with each iteration of the algorithm, stepping between successive functions g. To see this result, first observe that, by definition,

g(x ^((m+1)) , x ^((m)))≦g(x ^((m)) , x ^((m)))   (18)

Now from (15) and (16), it follows that

f(x ^((m+1)))≦g(x ^((m+1)) , x ^((m)))≦g(x ^((m)) , x ^((m)))=f(x ^((m)))   (19)

In other words, and as illustrated in FIG. 5, the function g(x,x^((n))) touches the function ƒ(x) at x^((n)) and attains its own minimum at the point x^((n+1)). That minimum point is used in the next iteration to define a new majorizing function g(x,x^((n+1))), from which the next minimum point may be determined.
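As a concrete illustration of the iteration in (17), the following minimal Python sketch applies the MM principle to a hypothetical one-dimensional objective, f(x) = (x − 3)² + 2|x|, which is not part of the radar model. The majorizer replaces |x| by the quadratic bound x²/(2|x^((m))|) + |x^((m))|/2, which is valid for x^((m)) ≠ 0. All function names are illustrative assumptions.

```python
def f(x):
    """Toy objective: a smooth quadratic plus an absolute-value penalty."""
    return (x - 3.0) ** 2 + 2.0 * abs(x)


def g(x, xm):
    """Majorizer of f at xm (xm != 0), using |x| <= x^2/(2|xm|) + |xm|/2."""
    return (x - 3.0) ** 2 + 2.0 * (x * x / (2.0 * abs(xm)) + abs(xm) / 2.0)


def mm_step(xm):
    """Closed-form minimizer of g(., xm): solves 2(x - 3) + 2x/|xm| = 0."""
    return 3.0 * abs(xm) / (abs(xm) + 1.0)


def run_mm(x0, num_it):
    """Return the list of MM iterates starting from x0."""
    xs = [x0]
    for _ in range(num_it):
        xs.append(mm_step(xs[-1]))
    return xs


iterates = run_mm(1.0, 30)
values = [f(x) for x in iterates]  # objective value at each iterate
```

In this toy case the objective values never increase from one iterate to the next, and the iterates approach x = 2, the minimizer of f.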

MM-Based Image Reconstruction Using L1-Regularization

Using the above parameters, in a first approach to providing a fast and accurate image by exploiting the known sparsity of the scatterers, the object data represented by the reflection coefficient vector is estimated using the well-established l₁-LS estimation method

$\begin{matrix} {\hat{x} = \underset{x}{\arg\min}\; \left\| y - Ax \right\|_{2}^{2} + \lambda \left\| x \right\|_{1},} & (20) \end{matrix}$

where λ is the regularization parameter or penalty parameter. In contrast to previous approaches, the optimization problem in (20) is solved using the above described MM framework, which leads to an iterative algorithm that is efficient, straightforward to implement, and amenable to parallelization. Additionally, the algorithm is guaranteed to monotonically decrease the objective function in (20) at each iteration, which supports reaching a final result through the iteration.

Recall that the objective function to be minimized is of the form

φ(x)=φ₁(x)+λφ₂(x)   (21)

where φ₁(x) ≜ ∥y−Ax∥₂ ² and φ₂(x) ≜ ∥x∥₁, and the regularization or penalty parameter λ is strictly positive. To find a majorizer for the function φ₁, we use a result from De Pierro (A. R. De Pierro, “A modified expectation maximization algorithm for penalized likelihood estimation in emission tomography,” Medical Imaging, IEEE Transactions on, vol. 14, no. 1, pp. 132 to 137, 1995, which is incorporated by reference herein) outlined as follows. First, φ₁(x) is expressed as

$\begin{matrix} {{{\varphi_{1}(x)} = {{\sum\limits_{k = 0}^{k - 1}y_{k}^{2}} - {2{\sum\limits_{k = 0}^{K - 1}{y_{k}\lbrack{Ax}\rbrack}_{k}}} + {\sum\limits_{k = 0}^{K - 1}\left( \lbrack{Ax}\rbrack_{k} \right)^{2}}}}{where}} & (22) \\ {\lbrack{Ax}\rbrack_{k}{\sum\limits_{l = 1}^{L}{A_{kl}x_{l}}}} & (23) \end{matrix}$

is the k^(th) component of the vector Ax. They then exploit the convexity of the square function and construct a majorizing function for ([Ax]_(k))². By denoting r_(k) as the number of nonzero elements in the k^(th) row of A and defining

$\begin{matrix} {c_{kl}\left\{ {\begin{matrix} {r_{k}^{- 1},} & {A_{kl} \neq 0} \\ {0,} & {{A_{kl} = 0},} \end{matrix}{they}\mspace{14mu} {have}} \right.} & (24) \\ {\left( \lbrack{Ax}\rbrack_{k} \right)^{2} = \left( {\sum\limits_{l = 1}^{L}{c_{kl}\left( {{r_{k}A_{kl}x_{l}} - {r_{k}A_{kl}x_{l}^{(m)}} + \left\lbrack {Ax}^{(m)} \right\rbrack_{k}} \right)}} \right)^{2}} & (25) \end{matrix}$

for any vector x^((m)) in ℝ^(L). Because

$\begin{matrix} {{\sum\limits_{l = 1}^{L}c_{kl}} = 1} & (26) \end{matrix}$

for all k, it follows from the convexity of the square function that

([Ax] _(k))² ≦q(x, x ^((m)))   (27)

where

$\begin{matrix} {{q\left( {x,x^{(m)}} \right)}{\sum\limits_{l = 1}^{L}{{c_{kl}\left( {{r_{k}A_{kl}x_{l}} - {r_{k}A_{kl}x_{l}^{(m)}} + \left\lbrack {Ax}^{(m)} \right\rbrack_{k}} \right)}^{2}.}}} & (28) \end{matrix}$

From Equation (27) and the fact that q (x^((m)), x^((m)))=([Ax^((m))]_(k))², it follows that q is a majorizing function for ([Ax]_(k))². Thus replacing ([Ax]_(k))² by q (x , x^((m))) in (22) produces

$\begin{matrix} {{Q_{1}\left( {x,x^{(m)}} \right)} = {{\sum\limits_{k = 0}^{K - 1}y_{k}^{2}} - {\sum\limits_{k = 0}^{K - 1}{y_{k}\lbrack{Ax}\rbrack}_{k}} + {\sum\limits_{k = 0}^{K - 1}{{q\left( {x,x^{(m)}} \right)}.}}}} & (29) \end{matrix}$

the desired majorizing function for φ₁.
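The majorization property of Q₁ can be checked numerically. The following Python sketch, which uses a small hypothetical matrix rather than radar data, evaluates φ₁ from (22) and Q₁ from (28) and (29); the function names and test values are assumptions made for illustration.

```python
def phi1(A, y, x):
    """Least-squares term phi_1(x) = ||y - A x||_2^2."""
    return sum((yk - sum(a * xl for a, xl in zip(row, x))) ** 2
               for row, yk in zip(A, y))


def q1_majorizer(A, y, x, xm):
    """Q_1(x, xm): phi_1 with each ([Ax]_k)^2 replaced by its convex bound."""
    total = 0.0
    for row, yk in zip(A, y):
        ax = sum(a * xl for a, xl in zip(row, x))     # [A x]_k
        axm = sum(a * xl for a, xl in zip(row, xm))   # [A x^(m)]_k
        rk = sum(1 for a in row if a != 0)            # non-zeros in row k
        # Sum over non-zero entries, where c_kl = 1 / r_k.
        q = sum((rk * a * (xl - xml) + axm) ** 2 / rk
                for a, xl, xml in zip(row, x, xm) if a != 0)
        total += yk * yk - 2.0 * yk * ax + q
    return total


# Hypothetical toy values.
A = [[1.0, 0.0, 2.0], [0.0, -1.0, 1.0], [3.0, 1.0, 0.0]]
y = [1.0, 0.5, -2.0]
xm = [0.2, -0.4, 1.0]
x = [1.0, 0.3, -0.7]
gap = q1_majorizer(A, y, x, xm) - phi1(A, y, x)   # non-negative by convexity
```

The gap is non-negative for any x and vanishes when x equals x^((m)), which are the two conditions (15) and (16).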

A quadratic majorizer for the absolute value function |x| was derived by de Leeuw and Lange (J. de Leeuw and K. Lange, “Sharp quadratic majorization in one dimension,” Computational statistics and data analysis, vol. 53, no. 7, pp. 2471 to 2484, 2009, which is incorporated by reference herein) where

$\begin{matrix} {{z\left( {x,y} \right)} = {{\sum\limits_{l = 1}^{L}\frac{x^{2}}{2{y}}} + {\frac{1}{2}{{y}.}}}} & (30) \end{matrix}$

It follows readily from this result that a majorizing function for the function φ₂ is

$\begin{matrix} {{Q_{2}\left( {x,x^{(m)}} \right)} = {\sum\limits_{l = 1}^{L}{{z\left( {x_{l},x_{l}^{(m)}} \right)}.}}} & (31) \end{matrix}$

Because λ is positive, a majorizing function for the l₁-LS objective function φ is

Q(x, x ^((m)))=Q ₁(x, x ^((m)))+λQ₂(x, x ^((m)))   (32)

From the general expression provided by G. Davis, S. Mallat, and M. Avellaneda, “Adaptive greedy approximations,” Constructive approximation, vol. 13, no. 1, pp. 57 to 98, 199, which is incorporated by reference herein, it follows that the next iterate is given by

$\begin{matrix} {x^{({m + 1})} = {{{Q\left( {x,x^{(m)}} \right)}.}}} & (33) \end{matrix}$

We obtain the desired iterative algorithm by setting to zero the derivative of Q (x, x^((m))) with respect to the components of x. Straightforward calculations show that the partial derivative of Q (x, x^((m))) with respect to x_(l) is given by

$\begin{matrix} {\frac{\partial Q\left( x,x^{(m)} \right)}{\partial x_{l}} = - 2\sum\limits_{k = 0}^{K - 1} y_{k}A_{kl} + 2\sum\limits_{k = 0}^{K - 1} \left( r_{k}A_{kl}^{2}x_{l} - r_{k}A_{kl}^{2}x_{l}^{(m)} + A_{kl}\left\lbrack Ax^{(m)} \right\rbrack_{k} \right) + \lambda\frac{x_{l}}{\left| x_{l}^{(m)} \right|}.} & (34) \end{matrix}$

Setting the derivatives in (34) to zero leads to the following MM algorithm for the l₁-LS estimation problem, representing application of a majorize-minimize principle to solve an l₁-regularized least-squares estimation problem associated with a mathematical model of image data from the initial data set:

$\begin{matrix} {x_{l}^{({m + 1})} = \frac{H_{l} \cdot x_{l}^{(m)} + G_{l}^{(m)}}{H_{l} + \frac{\lambda}{2\left| x_{l}^{(m)} \right|}}} & (35) \end{matrix}$

for l = 1, 2, . . . , L, where

$\begin{matrix} {H_{l} = \sum\limits_{k = 0}^{K - 1} r_{k}A_{kl}^{2},} & (36) \end{matrix}$

$\begin{matrix} {G_{l}^{(m)} = \sum\limits_{k = 0}^{K - 1} A_{kl}\left( y_{k} - \left\lbrack Ax^{(m)} \right\rbrack_{k} \right).} & (37) \end{matrix}$
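The step from the derivative in (34) to the update in (35) can be written out as follows. This is a worked restatement that uses only the definitions of H_(l) and G_(l) ^((m)) in (36) and (37), and it assumes x_(l) ^((m)) ≠ 0.

```latex
0 = -2\sum_{k=0}^{K-1} y_{k}A_{kl} + 2H_{l}x_{l} - 2H_{l}x_{l}^{(m)}
    + 2\sum_{k=0}^{K-1} A_{kl}\,[Ax^{(m)}]_{k} + \frac{\lambda\, x_{l}}{|x_{l}^{(m)}|}
  = 2H_{l}x_{l} - 2H_{l}x_{l}^{(m)} - 2G_{l}^{(m)} + \frac{\lambda\, x_{l}}{|x_{l}^{(m)}|}
\;\Longrightarrow\;
x_{l}\left( H_{l} + \frac{\lambda}{2|x_{l}^{(m)}|} \right) = H_{l}x_{l}^{(m)} + G_{l}^{(m)} .
```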

In other words, one may iteratively derive an estimated image value for a given voxel using the equation (35). Accordingly, a processing device may be configured to process an initial data set by creating an estimated image value for each voxel in the image by iteratively deriving the estimated image value through application of a majorize-minimize principle to solve an l₁-regularized least-squares estimation problem associated with a mathematical model of image data from the initial data set.
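The update in (35) through (37) can be sketched in a few lines of Python. The sketch below is a minimal dense-matrix illustration on a hypothetical 3×2 problem, not the fast implementation described later; the function names, the toy data, and the rule that a component equal to zero stays at zero (the limiting value of (35)) are assumptions made for illustration.

```python
def objective(A, y, x, lam):
    """l1-LS cost: ||y - A x||_2^2 + lam * ||x||_1."""
    resid = [yk - sum(a * xl for a, xl in zip(row, x)) for row, yk in zip(A, y)]
    return sum(r * r for r in resid) + lam * sum(abs(v) for v in x)


def mm_l1_ls_step(A, y, x, lam):
    """One pass of update (35) over all components of x."""
    K, L = len(A), len(x)
    r = [sum(1 for a in row if a != 0) for row in A]           # r_k
    Ax = [sum(a * xl for a, xl in zip(row, x)) for row in A]   # [A x^(m)]_k
    x_new = []
    for l in range(L):
        H = sum(r[k] * A[k][l] ** 2 for k in range(K))         # (36)
        G = sum(A[k][l] * (y[k] - Ax[k]) for k in range(K))    # (37)
        if x[l] == 0.0:
            # Assumption: keep a zero component at zero (limit of (35)).
            x_new.append(0.0)
        else:
            x_new.append((H * x[l] + G) / (H + lam / (2.0 * abs(x[l]))))
    return x_new


# Hypothetical toy problem (not radar data).
A = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
y = [1.0, 2.0, 2.0]
lam = 0.5
x = [1.0, 1.0]
costs = [objective(A, y, x, lam)]
for _ in range(200):
    x = mm_l1_ls_step(A, y, x, lam)
    costs.append(objective(A, y, x, lam))
```

Each component update reads only the current iterate, so the loop over l could be distributed across workers without changing the result. For this toy problem the iterates approach x = (8/9, 35/36), the point at which the gradient of the cost vanishes.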

One readily observes in (35) that the estimation of individual reflectance coefficients is decoupled because, for a given pixel, the computation of the next estimate x_(l) ^((m+1)) depends only on quantities available from the current iterate x^((m)) and not on the next estimates of the other pixels. The proposed algorithm is thus amenable to parallel, distributed, and/or graphics processing unit (GPU) processing, which can further expedite processing of the image data over other processing approaches that cannot be implemented using such processing techniques. Moreover, this approach can be readily applied to a variety of applications where datasets are collected using synthetic aperture imaging measurement principles.

A further benefit of this approach is the stability of the algorithm because of its convergence properties. To illustrate this benefit, certain theoretical results on the convergence of MM algorithms are described below to analyze the convergence properties of the above described MM-based l₁-LS algorithm.

First, we re-state here the so called Condition C2 and Theorem 3 of F. Vaida, “Parameter convergence for EM and MM algorithms,” Statistica Sinica, vol. 15, no. 3, p. 831, 2005, which is incorporated herein by reference, in the context of the MM algorithms, where they apply with minor modifications. These modifications include: (1) the regularity condition R4, which concerns the missing data distribution, is not necessary; and (2) for the regularity condition R5 and the condition C2, the expected log-likelihood function of the augmented data is now replaced by the majorizing function.

For the Condition C2 aspect as discussed in the Vaida reference, let S_(φ) be the set of stationary points defined as

$\begin{matrix} {{S_{\varphi} = \left\{ {{x^{*}\text{:}\mspace{14mu} \frac{\partial\;}{\partial x}{\varphi \left( x^{*} \right)}} = 0} \right\}},} & (38) \end{matrix}$

where φ is the l₁-LS cost function. For all x* ∈ S_(φ), there exists a unique global minimizer of the majorizing function Q(·, x*).

For the Theorem 3 aspect as discussed in the Vaida reference, consider an MM iteration sequence {x^((m))} that is defined by the starting point x⁽⁰⁾ and iteration x^((m+1))=M (x^((m))). If Condition C2 holds, then for any starting point, x^((m))→x* as m→∞, for some stationary point x* in S_(φ). Moreover, x*=M (x*) and, if x^((m))≠x* for all m, the sequence of cost function values φ(x^((m))) is strictly decreasing to φ(x*).

Theorem 3 gives a simple condition to test the convergence of MM algorithms; that is, if the global minimum of the majorizing function Q(·, x*) is unique for all x* ∈ S_(φ), then the sequence of MM iterates {x^((m)): m=0, 1, . . . } will converge to a stationary point. We now show that the majorizing function Q for the l₁-LS cost function φ is strictly convex and thus has a unique global minimizer for all stationary points.

The second-order partial derivatives of the function Q(·, x*) are given by

$\begin{matrix} {\frac{\partial^{2}{Q\left( {x,x^{*}} \right)}}{\partial x_{l}^{2}} = \left\{ \begin{matrix} {{2{\sum\limits_{k = 0}^{K - 1}{r_{k}A_{kl}^{2}}}},\frac{\lambda}{\left| x_{l}^{*} \right|},} & {l = k} \\ {0,} & {l \neq {k.}} \end{matrix} \right.} & (39) \end{matrix}$

Thus, the Hessian matrix of Q(·, x*) is diagonal. By the principles disclosed by T. T. Wu and K. Lange, “The MM alternative to EM,” Statistical Science, vol. 25, no. 4, pp. 492 to 505, 2010, which is incorporated by reference herein, x*_(l) must be non-zero for all l. Therefore, from (39) it follows that

$\begin{matrix} {0 < \frac{\partial^{2}Q\left( x,x^{*} \right)}{\partial x_{l}^{2}} < \infty\quad\text{for all}\ x,x^{*} \in \Omega,} & (40) \end{matrix}$

which implies that the Hessian matrix of Q(·, x*) is strictly positive definite. Consequently, Q(·, x*) is a strictly convex function and thus has a unique global minimum for all x* ∈ S_(φ). Finally, by Theorem 3, the MM-based l₁-LS algorithm is guaranteed to converge to a stationary point.

Description of the Fast Implementation

When applied in a typical GPR context, the computation of the term G_(l) ^((m)) in (35) requires the K×L matrix A where K=IJN. For the above described UWB SIRE radar system, these parameters are I=43 transmit locations, J=16 receive antennas, N=1350 data samples per return-profile, and L=25000 pixels.

These parameter settings require 173 gigabytes (GB) of memory to merely store the system matrix A. Because A has many zero-elements, however, the data could be more efficiently stored as a sparse matrix. Nevertheless, a sparse representation for A would still require approximately 16 GB of memory. With such a large memory size requirement, the construction of the A matrix in the current formulation of the algorithm is not feasible or practical for typical computing platforms, especially in field deployment where on site imaging would be advantageous. Indeed, virtually any other GPR image formation method that requires explicitly constructing the system matrix would have comparable requirements.
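The dense-storage figure can be reproduced with a short calculation. The sketch below assumes 8 bytes per matrix entry (double precision) and binary gigabytes (2³⁰ bytes); both are assumptions, since the text does not state the storage format.

```python
I, J, N, L = 43, 16, 1350, 25000    # values quoted in the text
K = I * J * N                       # number of rows of A
bytes_per_entry = 8                 # assumption: double-precision storage
dense_bytes = K * L * bytes_per_entry
dense_gib = dense_bytes / 2 ** 30   # binary gigabytes
print(K, round(dense_gib, 1))       # prints: 928800 173.0
```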

In addition to memory size challenges, computational cost would also be an issue for the current format of the MM-based l₁-LS algorithm. At each iteration, the computation of G_(l) ^((m)) would require the matrix multiplication Ax^((m)), which has O(KL) time complexity. This operation is thus not practical for large-scale implementations where the parameters K and L are expected to be relatively large. For example, in our case, we have K=IJN=928800 and L=25000. Additional costs include the computation of the term H_(l) where the number of non-zero elements in each of the K rows of A is needed. To arrive at a fast and memory-efficient implementation of the algorithm in (35), the following acceleration techniques may be implemented.

Fast Implementation of G_(l) ^((m))

In a GPR context, the mathematical expressions at equations (36) and (37) above can be modified to reduce processing time and required memory by accounting for a symmetric nature of a given radar pulse, accounting for similar discrete time delays between transmission of a given radar pulse and reception of reflections from the given radar pulse, and accounting for a short duration of the given radar pulse. Accordingly, the equation for determining estimates for values x representing reflectance coefficients of the objects in the SOI, (35), involves calculation of the terms G_(l) ^((m)) and H_(l), which calculation can be streamlined according to the above assumptions. In application, a processing device is configured to calculate terms used to obtain the estimated value. Pursuant to these aspects, the expression for G_(l) ^((m)) in (37) can be written as

$\begin{matrix} {G_{l}^{(m)} = {\sum\limits_{i = 1}^{I}{\sum\limits_{j = 1}^{J}{\sum\limits_{n = 0}^{N - 1}{\alpha_{ijl} \cdot {p\left( {{nT}_{s} - \tau_{ijl}} \right)} \cdot {g_{ij}^{(m)}\left( {nT}_{s} \right)}}}}}} & (41) \end{matrix}$

where, for n=0, 1, . . . , N−1,

$\begin{matrix} {{g_{ij}^{(m)}\left( {nT}_{s} \right)} = {{y_{ij}\lbrack n\rbrack} - {\sum\limits_{l = 1}^{L}{\alpha_{ijl} \cdot x_{l}^{(m)} \cdot {{p\left( {{nT}_{s} - \tau_{ijl}} \right)}.}}}}} & (42) \end{matrix}$

To facilitate a fast implementation, we approximate the quantity G_(l) ^((m)) by

$\begin{matrix} {{{\hat{G}}_{l}^{(m)}\overset{\Delta}{=}{\sum\limits_{i = i}^{I}\; {\sum\limits_{j = 1}^{J}\; {\sum\limits_{n = 0}^{N - 1}\; {\alpha_{ijl} \cdot {p\left( {{nT}_{s} - {n_{ijl}T_{s}}} \right)} \cdot {{\hat{g}}_{ij}^{(m)}\left( {nT}_{s} \right)}}}}}},{where}} & (43) \\ {n_{ijl}\overset{\Delta}{=}{{round}\mspace{14mu} \left( \frac{\tau_{ijl}}{T_{s}} \right)}} & (44) \\ {{{\hat{g}}_{ij}^{(m)}\left( {nT}_{s} \right)}\overset{\Delta}{=}{{y_{ij}\lbrack n\rbrack} - {{\hat{s}}_{ij}^{(m)}\left( {nT}_{s} \right)}}} & (45) \\ {{{\hat{s}}_{ij}^{(m)}\left( {nT}_{s} \right)}\overset{\Delta}{=}{\sum\limits_{l = 1}^{L}\; {\alpha_{ijl} \cdot x_{l}^{(m)} \cdot {{p\left( {{nT}_{s} - {n_{ijl}T_{s}}} \right)}.}}}} & (46) \end{matrix}$

We will refer to the set of values {n_(ijl)} as the discrete-time delays. We can write Ĝ_(l) ^((m)) as

$\begin{matrix} {{\hat{G}}_{l}^{(m)} = \sum\limits_{i = 1}^{I} \sum\limits_{j = 1}^{J} \alpha_{ijl} \cdot {\hat{G}}_{ijl}^{(m)}} & (47) \end{matrix}$

where

$\begin{matrix} {{\hat{G}}_{ijl}^{(m)}} & {= \sum\limits_{n = 0}^{N - 1} p\left( nT_{s} - n_{ijl}T_{s} \right) \cdot {\hat{g}}_{ij}^{(m)}\left( nT_{s} \right)} & (48) \\ & {= \sum\limits_{n = 0}^{N - 1} w\left\lbrack n - n_{ijl} \right\rbrack \cdot {\hat{g}}_{ij}^{(m)}\lbrack n\rbrack} & (49) \\ & {= \left. \left\{ \sum\limits_{n = 0}^{N - 1} w\left\lbrack n - k \right\rbrack\, {\hat{g}}_{ij}^{(m)}\lbrack n\rbrack \right\} \right|_{k = n_{ijl}}} & (50) \end{matrix}$

with w[n] ≜ p(nT_(s)). Because the transmitted pulse p(t) is symmetric, w[n−k]=w[k−n] holds for all n and k, and thus

$\begin{matrix} \begin{matrix} {{\hat{G}}_{ijl}^{(m)} = {\left\{ {\sum\limits_{n = 0}^{N - 1}\; {{w\left\lbrack {k - n} \right\rbrack}{{\hat{g}}_{ij}^{(m)}\lbrack n\rbrack}}} \right\} _{k = n_{ijl}}}} \\ {= {\left\{ {\left( {w*{\hat{g}}_{ij}^{(m)}} \right)\lbrack k\rbrack} \right\} _{k = n_{ijl}}(52)}} \\ {= {\left\{ {\left( {w*\left( {y_{ij} - {\hat{s}}_{ij}^{(m)}} \right)} \right)\lbrack k\rbrack} \right\} _{k = n_{ijl}}{.(53)}}} \end{matrix} & (51) \end{matrix}$

It is readily observed that computing Ĝ_(ijl) ^((m)) requires the convolution of the discrete pulse w[n] with the m^(th) iteration of the error-term sequence (y_(ij)[n]−ŝ_(ij) ^((m))[n]). The sequences w[n] and y_(ij)[n] are given. Hence, to efficiently compute Ĝ_(ijl) ^((m)), a computationally efficient way for generating the sequence ŝ_(ij) ^((m))[n] is needed.
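The role of pulse symmetry in passing from the sum in (50) to the convolution in (51) and (52) can be illustrated numerically. The Python sketch below uses hypothetical pulse samples on the indexes −M, . . . , M with M = 2 and a hypothetical error sequence; the helper names are assumptions.

```python
def make_w(center_samples):
    """Return w(n) for a pulse sampled on n = -M..M and zero elsewhere."""
    M = (len(center_samples) - 1) // 2

    def w(n):
        return center_samples[n + M] if -M <= n <= M else 0.0

    return w


def correlate_at(w, g, k):
    """sum_n w[n - k] * g[n], the form appearing in (50)."""
    return sum(w(n - k) * g[n] for n in range(len(g)))


def convolve_at(w, g, k):
    """(w * g)[k] = sum_n w[k - n] * g[n], the form in (51) and (52)."""
    return sum(w(k - n) * g[n] for n in range(len(g)))


w_sym = make_w([0.1, 0.5, 1.0, 0.5, 0.1])    # symmetric: w[n] == w[-n]
w_asym = make_w([0.0, 0.5, 1.0, 0.2, 0.0])   # not symmetric
g = [0.3, -1.2, 0.8, 2.0, -0.5, 0.0, 1.1]    # hypothetical error sequence

sym_gap = max(abs(correlate_at(w_sym, g, k) - convolve_at(w_sym, g, k))
              for k in range(len(g)))
asym_gap = max(abs(correlate_at(w_asym, g, k) - convolve_at(w_asym, g, k))
               for k in range(len(g)))
```

For the symmetric pulse the two forms agree at every index, while for the asymmetric pulse they differ, which is why the symmetry assumption matters for this rewriting.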

First, we note that the collection of discrete-time delays {n_(ijl)} is expected to have repeated values. Let k_(min) and k_(max) denote respectively the minimum and maximum discrete-time delays. The sifting property of the unit impulse function can be used to write

$\begin{matrix} {{\hat{s}}_{ij}^{(m)}\lbrack n\rbrack} & {= \sum\limits_{k = k_{\min}}^{k_{\max}} \left\{ \sum\limits_{l = 1}^{L} \alpha_{ijl} \cdot x_{l}^{(m)} \cdot w\left\lbrack n - n_{ijl} \right\rbrack \cdot \delta\left\lbrack k - n_{ijl} \right\rbrack \right\}} & (54) \\ & {= \sum\limits_{k = k_{\min}}^{k_{\max}} \left\{ \sum\limits_{l = 1}^{L} \alpha_{ijl} \cdot x_{l}^{(m)} \cdot w\left\lbrack n - k \right\rbrack \cdot \delta\left\lbrack k - n_{ijl} \right\rbrack \right\}} & (55) \\ & {= \sum\limits_{k = k_{\min}}^{k_{\max}} \left\{ \sum\limits_{l = 1}^{L} \alpha_{ijl} \cdot x_{l}^{(m)} \cdot \delta\left\lbrack k - n_{ijl} \right\rbrack \right\} w\left\lbrack n - k \right\rbrack} & (56) \\ & {= \sum\limits_{k = k_{\min}}^{k_{\max}} q_{ij}\lbrack k\rbrack \cdot w\left\lbrack n - k \right\rbrack} & (57) \\ & {= \left( q_{ij}*w \right)\lbrack n\rbrack} & (58) \end{matrix}$

where

$\begin{matrix} {q_{ij}\lbrack k\rbrack \overset{\Delta}{=} \sum\limits_{l = 1}^{L} \alpha_{ijl} \cdot x_{l}^{(m)} \cdot \delta\left\lbrack k - n_{ijl} \right\rbrack.} & (59) \end{matrix}$

The term q_(ij)[k] can then be expressed as

$\begin{matrix} {{q_{ij}\lbrack k\rbrack}\overset{\Delta}{=}{\sum\limits_{l \in _{k}}\; d_{ijl}^{(m)}}} & (60) \end{matrix}$

where d_(ijl) ^((m))=α_(ijl)·x_(l) ^((m)) and S_(k)={l=1, 2, . . . , L:n_(ijl)=k}.

In other words, the term q_(ij)[k] is computed by accumulating all elements of

d _(ij) ^((m)) =[d _(ij1) ^((m)) , d _(ij2) ^((m)) , . . . , d _(ijL) ^((m))]  (61)

for which associated discrete-time delay indexes n_(ijl) have the same integer value k. Consequently, q_(ij)[k] can then be computed in a very efficient manner using the hash table data structure concept. The indexes of a hash table, typically referred to as keys, are the integers between k_(min) and k_(max), and the record associated with the k^(th) key is the set of values

{d _(ijl) ^((m)): l=1, 2, . . . , L; n _(ijl) =k}.   (62)
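The accumulation in (60) can be sketched in Python as follows. Because the keys are consecutive integers, a plain list indexed by k stands in for the hash table here; the function name and the sample values of α, x^((m)), and the delays are hypothetical. Note that Python lists are indexed from 0, whereas MATLAB's accumarray expects positive integer subscripts.

```python
def accumulate_by_delay(delays, d, k_max):
    """q[k] = sum of d[l] over all l with delays[l] == k, for k = 0..k_max."""
    q = [0.0] * (k_max + 1)
    for k, value in zip(delays, d):
        q[k] += value           # one pass over l; repeated keys accumulate
    return q


alpha = [0.5, 1.0, 2.0, 1.0, 0.25]    # hypothetical alpha_ijl values
x_m = [1.0, -2.0, 0.5, 4.0, 8.0]      # hypothetical current estimate x^(m)
n_ij = [3, 5, 3, 4, 5]                # hypothetical discrete-time delays
d = [a * x for a, x in zip(alpha, x_m)]
q = accumulate_by_delay(n_ij, d, max(n_ij))
print(q)   # prints: [0.0, 0.0, 0.0, 1.5, 4.0, 0.0]
```

A single pass over the L values suffices, regardless of how many distinct delays occur.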

By one approach, the hash-table-based computation of q_(ij)[k] is implemented on a processing device using the MATLAB accumarray function. The variables d, n, and q store the following sequences:

d←d _(ij) ^((m)) =[d _(ij1) ^((m)) , d _(ij2) ^((m)) , . . . , d _(ijL) ^((m))]  (63)

n←n _(ij) =[n _(ij1) ,n _(ij2) , . . . , n _(ijL)]  (64)

q←q _(ij) =[q _(ij)[1], q _(ij)[2], . . . ,q _(ij) [k _(max)]]  (65)

The variable q is computed via the command q=accumarray(n, d). Because k_(min)≦n_(ijl)≦k_(max) for all l, q_(ij)[k]=0 for all indexes k<k_(min). An example of pseudocode to be run by the processing device for implementation of the proposed algorithm for efficiently computing G_(l) ^((m)) is given below.

▪ Subroutine 1: Pseudocode for computing G_(l) ^((m)) for l = 1, 2, . . . , L
for i = 1, 2, . . . , I do
 for j = 1, 2, . . . , J do
  SET q_(ij)[k] = 0 for 0 ≦ k < k_(min)
  for k = k_(min), k_(min) + 1, . . . , k_(max) do
   S_(k) = {l = 1, 2, . . . , L:n_(ijl) = k}
   $q_{ij}\lbrack k\rbrack = \sum\limits_{l \in S_{k}} d_{ijl}^{(m)}$ (hash-table implementation)
  end for
  ŝ_(ij) ^((m))[n] = (q_(ij) * w)[n]
  for l = 1, 2, . . . , L do
   Ĝ_(ijl) ^((m)) = {(w * (y_(ij) − ŝ_(ij) ^((m))))[k]}|_(k=n_(ijl))
  end for
 end for
end for
${\hat{G}}_{l}^{(m)} = \sum\limits_{i = 1}^{I} \sum\limits_{j = 1}^{J} \alpha_{ijl} \cdot {\hat{G}}_{ijl}^{(m)}$
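The steps of Subroutine 1 for a single transmit/receive pair (i, j) can be sketched in Python and compared against a direct computation that builds the N×L block of the system matrix explicitly. The sketch assumes a symmetric pulse sampled on −M, . . . , M and integer delays within 0, . . . , N−1; all names and sample values are hypothetical. The convolutions are written as plain sums for clarity, and an FFT-based convolution could be substituted in practice.

```python
def make_w(samples):
    """w(n) for a symmetric pulse sampled on n = -M..M, zero elsewhere."""
    M = (len(samples) - 1) // 2
    return lambda n: samples[n + M] if -M <= n <= M else 0.0


def g_direct(w, alpha, delays, x, y):
    """Reference: build the N x L block explicitly, then A^T (y - A x)."""
    N, L = len(y), len(x)
    A = [[alpha[l] * w(n - delays[l]) for l in range(L)] for n in range(N)]
    resid = [y[n] - sum(A[n][l] * x[l] for l in range(L)) for n in range(N)]
    return [sum(A[n][l] * resid[n] for n in range(N)) for l in range(L)]


def g_fast(w, alpha, delays, x, y):
    """Subroutine-1 style: accumulate, convolve, subtract, convolve, sample."""
    N = len(y)
    q = [0.0] * N
    for l, k in enumerate(delays):                     # q_ij[k], as in (60)
        q[k] += alpha[l] * x[l]
    s_hat = [sum(q[k] * w(n - k) for k in range(N)) for n in range(N)]  # (58)
    err = [y[n] - s_hat[n] for n in range(N)]          # y_ij - s_hat_ij
    c = [sum(w(k - n) * err[n] for n in range(N)) for k in range(N)]    # (53)
    return [alpha[l] * c[delays[l]] for l in range(len(x))]


# Hypothetical toy values for one (i, j) pair.
w = make_w([0.2, 1.0, 0.2])
alpha = [1.0, 0.5, 2.0, 1.5]
delays = [2, 4, 2, 6]
x = [0.3, -1.0, 0.7, 0.2]
y = [0.0, 0.1, 1.5, 0.4, -0.6, 0.2, 0.3, 0.05]
direct = g_direct(w, alpha, delays, x, y)
fast = g_fast(w, alpha, delays, x, y)
```

The two results agree to rounding error, and the fast version never forms the matrix block. Summing the per-pair contributions over all i and j gives Ĝ_(l) ^((m)).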

In addition to being more computationally efficient, the proposed implementation does not require constructing the large system matrix A. A tangible benefit of this fact is that the size of the data (i.e., the number of transmit locations) that can be used to form an image is no longer limited by the memory needed to store A. It is also readily observed from the pseudocode that the computation of Ĝ_(l) ^((m)) is parallelizable such that faster processing techniques such as parallel or GPU-based processing can be used to process the data.

Fast Implementation of H_(l)

An alternative expression for H_(l) in (36) is

$\begin{matrix} {H_{l} = \sum\limits_{i = 1}^{I} \sum\limits_{j = 1}^{J} \sum\limits_{n = 0}^{N - 1} \left( \alpha_{ijl} \right)^{2} \cdot r_{ijn} \cdot \beta\left( nT_{s} - \tau_{ijl} \right),} & (66) \end{matrix}$

where β(t) ≜ p²(t) and r_(ijn) is the number of non-zero elements in the n^(th) row of the N×L sub-matrix A_(ij)=P_(ij)D_(ij). To facilitate a fast implementation, we approximate the quantity H_(l) by

$\begin{matrix} {{\hat{H}}_{l} = \sum\limits_{i = 1}^{I} \sum\limits_{j = 1}^{J} \sum\limits_{n = 0}^{N - 1} \left( \alpha_{ijl} \right)^{2} \cdot r_{ijn} \cdot \beta\left( nT_{s} - n_{ijl}T_{s} \right).} & (67) \end{matrix}$

We write Ĥ_(l) as

$\begin{matrix} {{\hat{H}}_{l} = \sum\limits_{i = 1}^{I} \sum\limits_{j = 1}^{J} \left( \alpha_{ijl} \right)^{2} \cdot {\hat{H}}_{ijl}} & (68) \end{matrix}$

where

$\begin{matrix} {{\hat{H}}_{ijl}} & {= \sum\limits_{n = 0}^{N - 1} r_{ijn} \cdot \beta\left( nT_{s} - n_{ijl}T_{s} \right)} & (69) \\ & {= \sum\limits_{n = 0}^{N - 1} \gamma_{ij}\lbrack n\rbrack \cdot h\left\lbrack n - n_{ijl} \right\rbrack} & (70) \\ & {= \left. \left\{ \sum\limits_{n = 0}^{N - 1} \gamma_{ij}\lbrack n\rbrack \cdot h\left\lbrack n - k \right\rbrack \right\} \right|_{k = n_{ijl}}} & (71) \\ & {= \left. \left\{ \sum\limits_{n = 0}^{N - 1} \gamma_{ij}\lbrack n\rbrack \cdot h\left\lbrack k - n \right\rbrack \right\} \right|_{k = n_{ijl}}} & (72) \\ & {= \left. \left\{ \left( \gamma_{ij}*h \right)\lbrack k\rbrack \right\} \right|_{k = n_{ijl}}} & (73) \end{matrix}$

with h[n] ≜ β(nT_(s)), and r_(ijn) is now represented by the n-indexed sequence γ_(ij)[n] ≜ r_(ijn). For the sake of convenience and consistency, we assume here that rows of a matrix are counted starting from a zeroth row. The computation of Ĥ_(ijl) requires the convolution of the squared and discretized pulse h[n] with the sequence γ_(ij)[n]. The computation of Ĥ_(ijl) is significantly accelerated with the introduction of a fast procedure for generating γ_(ij)[n].

First, we recall that the sample γ_(ij)[n] is the number of non-zero entries in the n^(th) row of the N×L sub-matrix A_(ij). Because the radar system has an ultra wide band, the transmitted pulse p(t) is short. Consequently, the samples of the sequence w[n] are non-zero for indexes n such that |n|≦M and zero (or practically zero) otherwise. The parameter M is even and significantly smaller than N. The l^(th) column of A_(ij) coincides with the length-N vector

[α_(ijl) ·p(0−τ_(ijl)), α_(ijl) ·p(T _(s)−τ_(ijl)), . . . , α_(ijl) ·p((N−1) T _(s)−τ_(ijl))]^(T).   (74)

The (n, l)-entry of A_(ij) is thus non-zero if

|nT _(s)−τ_(ijl) |≦MT _(s).   (75)

Using (44), the above rule in (75) can be approximated by

|n−n _(ijl) |≦M.   (76)

A computed delay index n_(ijl) is such that 0≦n_(ijl)≦N. Consequently, for computational convenience, we write that the (n, l)-entry of A_(ij) is non-zero if

max(0, n−M)≦n _(ijl)≦min(n+M, N).   (77)

The number γ_(ij)[n] of non-zero entries in the n^(th) row of A_(ij) is thus equal to the number of elements in the n^(th) row that satisfy (77). A more convenient definition is

γ_(ij) [n]=|T_(n)|  (78)

where |T_(n)| denotes the number of elements in the set

T_(n) ={l=1, 2, . . . , L | max(0, n−M)≦n_(ijl)≦min(n+M, N)}.   (79)

The parameter γ_(ij)[n] can be efficiently computed by taking advantage of the hash-table-based fast implementation concept used in (60). First, we write

γ_(ij) [n]=|T_(n) ⁺|−|T_(n) ⁻|  (80)

where

T_(n) ⁺ ={l=1, 2, . . . , L | n _(ijl)≦min(n+M, N)}  (81)

T_(n) ⁻ ={l=1, 2, . . . , L | n _(ijl)≦max(0, n−M−1)}.   (82)

The expression in (80) is further expanded as

$\begin{matrix} {{\gamma_{ij}\lbrack n\rbrack} = {{\sum\limits_{k = 0}^{\min {({{n + M},N})}}\; {_{k}}} - {\sum\limits_{k = 0}^{\max {({0,{n - M}})}}\; {_{k}}}}} & (83) \end{matrix}$

Finally, we have

γ_(ij) [n]=ν[min(n+M, N)]−ν[max(0, n−M)]  (84)

where

$\begin{matrix} {{v\lbrack m\rbrack} = {\sum\limits_{k = 0}^{m}\; {_{k}}}} & (85) \end{matrix}$

with S_(k)={l=1, 2, . . . , L: n_(ijl) =k}. The inner summation in (85) (and, hence the computation of ν[m]) is efficiently computed using the hash-table-based fast implementation previously discussed and used in (60). Example pseudocode to be run by the processing device for implementation of the proposed algorithm for efficiently computing H_(l) is given below.

▪ Subroutine 2: Pseudocode for computing H_(l) for l = 1, 2, . . . , L
for i = 1, 2, . . . , I do
 for j = 1, 2, . . . , J do
  SET q[k] = 0 for 0 ≦ k < k_(min)
  for k = k_(min), k_(min) + 1, . . . , k_(max) do
   S_(k) = {l = 1, 2, . . . , L:n_(ijl) = k}
   q[k] = |S_(k)| (hash-table implementation)
  end for
  for m = 0, 1, . . . , N do
   $\nu\lbrack m\rbrack = \sum\limits_{k = 0}^{m} q\lbrack k\rbrack$
  end for
  for n = 0, 1, . . . , N do
   γ_(ij)[n] = ν[min(n + M, N)] − ν[max(0, n − M)]
  end for
  for l = 1, 2, . . . , L do
   Ĥ_(ijl) = {(γ_(ij) * h)[k]}|_(k=n_(ijl))
  end for
 end for
end for
${\hat{H}}_{l} = \sum\limits_{i = 1}^{I} \sum\limits_{j = 1}^{J} \left( \alpha_{ijl} \right)^{2} \cdot {\hat{H}}_{ijl}$
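The counting step of Subroutine 2 can be sketched in Python as a histogram of the delays followed by a running sum. The sketch below checks the result against a brute-force count of the window in (79). As an assumption made for this illustration, the lower term is taken as the cumulative count strictly below the lower edge of the window (zero when the window starts at index 0), so that delays equal to the lower edge are included in the count; the function names and sample delays are hypothetical.

```python
def gamma_brute(delays, N, M):
    """For each row n, count delays with max(0, n-M) <= n_l <= min(n+M, N)."""
    return [sum(1 for k in delays if max(0, n - M) <= k <= min(n + M, N))
            for n in range(N + 1)]


def gamma_fast(delays, N, M):
    """Same counts from a histogram of the delays and its running sum."""
    hist = [0] * (N + 1)                  # |S_k| for k = 0..N
    for k in delays:
        hist[k] += 1
    nu = []                               # nu[m] = sum of |S_k| for k <= m
    running = 0
    for count in hist:
        running += count
        nu.append(running)
    gamma = []
    for n in range(N + 1):
        hi = min(n + M, N)
        lo = max(0, n - M)
        below = nu[lo - 1] if lo > 0 else 0   # delays strictly below the window
        gamma.append(nu[hi] - below)
    return gamma


delays = [0, 2, 2, 5, 7, 7, 7, 9]         # hypothetical discrete-time delays
fast_counts = gamma_fast(delays, N=10, M=2)
brute_counts = gamma_brute(delays, N=10, M=2)
```

After the one-time histogram and running sum, each γ_(ij)[n] costs two table lookups instead of a scan over all L delays.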

Putting together the results for calculating terms G_(l) ^((m)) and H_(l), example pseudocode for the l₁-LS algorithm follows.

▪ Pseudocode for computing l₁-LS algorithm for m = 0, 1, . . . , num_(it) − 1
Initialization: x⁽⁰⁾ = {x₁ ⁽⁰⁾, x₂ ⁽⁰⁾, . . . , x_(L) ⁽⁰⁾}
for l = 1, 2, . . . , L do
 Compute Ĥ_(l) (via Subroutine 2)
end for
for m = 0, 1, . . . , num_(it) − 1 do
 for l = 1, 2, . . . , L do
  Compute Ĝ_(l) ^((m)) (via Subroutine 1)
  $x_{l}^{({m + 1})} = \frac{{\hat{H}}_{l} \cdot x_{l}^{(m)} + {\hat{G}}_{l}^{(m)}}{{\hat{H}}_{l} + \frac{\lambda}{2\left| x_{l}^{(m)} \right|}}$
 end for
end for

So configured, the described MM-based l₁-LS algorithm is applicable to large-scale, real applications. Although the proposed algorithm effectively estimates reflection coefficients of scenes-of-interest using GPR datasets, the algorithm could be readily applied to a variety of applications where datasets are collected using synthetic aperture imaging measurement principles. When compared to images produced by the DAS or RSM algorithms, the image obtained using the MM-based l₁-LS algorithm is more accurate, is less noisy, and captures the main scatterers in the scene-of-interest while effectively suppressing shadows and side lobes. Although the proposed algorithm is still more computationally expensive than the DAS algorithm, a derived acceleration technique produces a fast-implementation version that is very fast and requires substantially less memory. Moreover, because the algorithm decouples the estimation of individual reflectance coefficients, further computational speed gains are achievable via parallel and GPU processing implementations.

By one approach, the method described above can be implemented as illustrated in FIG. 6, where a processing device receives 605 an initial data set and processes 610 the initial data set by creating an estimated image value for each voxel in the image by iteratively deriving the estimated image value through application of a majorize-minimize principle to solve an l₁-regularized least-squares estimation problem associated with a mathematical model of image data from the initial data set. These basic steps can be applied to prepare, quickly and with efficient computation, an image from the estimated image values of the individual voxels, which image can be displayed 615, wherein the initial data set may be sourced from a variety of applications where datasets are collected using synthetic aperture imaging measurement principles. In the GPR context, the method may further include emitting 620 a radar pulse at specified intervals into a scene-of-interest and detecting 625 the magnitude of signal reflections of the radar pulse from the scene-of-interest. Position data is recorded 630 corresponding to individual radar pulse emissions and individual receptions of the signal reflections. The initial data set in this application is created 635 from the position data and detected magnitudes of the signal reflections. Where the method is carried out remote from the vehicle, it is sufficient that the receipt of the initial data set to be processed includes receiving data representing transmission site locations of radar pulses, reception site locations of reception of reflections from the radar pulses, radar-return profiles for pairings of the transmission site locations and the reception site locations, and data samples associated with individual radar-return profiles.

Results for the MM-Based l₁-LS Algorithm

The performance of the MM-based l₁-LS algorithm can be evaluated using a numerical experiment and using a real dataset as obtained using a UWB SIRE apparatus and provided by the US Army Research Laboratory.

With reference to FIG. 7, the numerical experiment demonstrates that the accuracy of the proposed algorithm is comparable to that of existing standard l₁-LS algorithms despite the described computational efficiencies over these algorithms. In the numerical experiment, a length-N data vector y is generated using the standard additive white Gaussian noise (AWGN) model y=Ax+w, where x is a length-L vector of regression coefficients and w is an AWGN vector with variance σ². The numerical values for the parameters are N=500, L=23 and σ²=1. The 500×23 system matrix A is randomly generated. FIG. 7 illustrates the estimation accuracy of the proposed MM-based l₁-LS algorithm as compared to that of previously used algorithms including the LASSO, the shooting, and the standard l₁-LS algorithms. As illustrated, the accuracies of these various algorithms are substantially overlapping, meaning that the accuracies are approximately the same. As discussed above, while all algorithms give comparable performance results where the data size is relatively small, when the data size is large such as in a GPR application, only the proposed MM-based l₁-LS algorithm can produce a result using reasonable computing resources, i.e., processing and memory resources. For instance, the LASSO, the shooting, and the standard l₁-LS algorithms fail due to memory size limitations.

FIGS. 8-10 illustrate the images provided by application of the MM-based l₁-LS algorithm (FIG. 8) to data collected by the UWB SIRE system as compared to the images provided by the DAS (FIG. 9) and RSM (FIG. 10) algorithms as applied to the same data set. This test data set corresponds to measurements taken from I=274 consecutive transmit locations using J=16 receive antennas. The scene-of-interest is of size 65×25 m², and is divided into a grid of 250 voxels in the cross-range direction and 3200 voxels in the down-range direction. Each voxel measures 0.1 m in the cross-range direction and 0.02 m in the down-range direction. These parameters and their values are summarized below in Table 1.

TABLE 1
Parameters for the real data

Parameter                               Value
Image dimension                         25 m (cross-range) by 65 m (range)
Voxel size                              0.1 m (cross-range) by 0.02 m (range)
Number of transmit locations            274
Transmit locations per sub-aperture     43
Sub-aperture dimension (cross-range)    25 m (cross-range) by 2 m (range)

FIGS. 8-10 show complete images of the scene-of-interest that are generated using the DAS, MM-based l₁-LS, and RSM algorithms. FIG. 9 shows the image obtained using the DAS algorithm. The image has significant side lobes and shadows. Moreover, the presence of background noise is clearly visible. FIG. 10 shows the image obtained using the RSM algorithm. Although side lobes and shadows are reduced, there is still room for improvement. The image obtained using the proposed l₁-LS algorithm, shown in FIG. 8, is sparser and adequately suppresses both the side lobes and background noise.

Another Approach: MM-Based Least Absolute Deviation (LAD) Algorithm with l₁-Regularization

A second approach to application of the MM principle to processing an image data set includes application of this principle to a different approach to the least squares technique. More specifically, such a method includes creating an estimated image value for each voxel in the image by iteratively deriving the estimated image value through application of an MM principle to solve the l₁-regularized least-absolute deviation (LAD) estimation problem associated with a mathematical model of image data from the initial data set.

The so-called l₁-regularized LAD estimation problem is a known alternative to least squares regression analysis. This approach is known to handle outlier data in a more robust fashion, but at the cost of increased computational resources. In this approach, the reflectance coefficient vector, which represents objects in the SOI to be detected, is estimated using the l₁-regularized least absolute deviation (l₁-LAD) method:

$\begin{matrix} {\hat{x} = \underset{x}{\arg\min}\; \left\| y - Ax \right\|_{1} + \lambda\left\| x \right\|_{1},} & (86) \end{matrix}$

where λ is the regularization parameter. We solve the optimization problem in (86) using the MM principle as shown by D. R. Hunter and K. Lange, “A tutorial on MM algorithms,” The American Statistician, vol. 58, no. 1, pp. 30 to 37, 2004, which is incorporated herein by reference. The resulting algorithm is straightforward to implement, computationally efficient and amenable to parallel (or distributed) implementations.

The MM principle is described above. To solve the GPR image formation problem in (86) using the MM principle, the objective function to be minimized can be written as

φ(x)=φ₁(x)+λφ₂(x)   (87)

where φ₁(x)=∥y−Ax∥₁, φ₂(x)=∥x∥₁, and the regularization parameter λ is positive. A quadratic majorizer for the absolute value function ƒ(x)=|x| is given by De Leeuw and Lange as referenced above. From that result, it directly follows that for any real L×1 vector x^((m)) without any zero elements

$\begin{matrix} {{Q_{2}\left( {x,x^{(m)}} \right)} = {{\sum\limits_{l = 1}^{L}\; \frac{x_{l}^{2}}{2{x_{l}^{(m)}}}} + {\frac{1}{2}{x_{l}^{(m)}}}}} & (88) \end{matrix}$

is a majorizing function for φ₂(x)=Σ_(l=1) ^(L)|x_(l)| at the point x^((m)).
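The majorization property behind (88) can be checked numerically. The following is a minimal sketch, not part of the described method; the function names `abs_majorizer` and `l1_majorizer` are illustrative assumptions. It relies on the identity x²/(2|x^((m))|) + |x^((m))|/2 − |x| = (|x| − |x^((m))|)²/(2|x^((m))|), which is never negative and vanishes when |x| = |x^((m))|.

```python
def abs_majorizer(x, xm):
    """Quadratic majorizer of |x| at a nonzero point xm: x^2/(2|xm|) + |xm|/2."""
    if xm == 0:
        raise ValueError("the majorizer is undefined at xm == 0")
    return x * x / (2.0 * abs(xm)) + abs(xm) / 2.0


def l1_majorizer(x, xm):
    """Q2(x, xm) as in (88): sum of the scalar majorizers over all components."""
    return sum(abs_majorizer(xl, xml) for xl, xml in zip(x, xm))


if __name__ == "__main__":
    xm = [1.5, -0.5, 2.0]
    # The majorizer touches the l1 norm at x = xm ...
    print(l1_majorizer(xm, xm), sum(abs(v) for v in xm))
    # ... and lies on or above it elsewhere.
    x = [0.2, 3.0, -1.0]
    print(l1_majorizer(x, xm) >= sum(abs(v) for v in x))
```

The requirement that x^((m)) has no zero elements shows up here as the explicit `ValueError`.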

A majorizing function for φ₁(x) is now constructed by first replacing the absolute value function by De Leeuw and Lange's majorizing function for the absolute value function

$\begin{matrix} \begin{matrix} {{\varphi_{1}(x)} = {{\sum\limits_{k = 0}^{K - 1}\; {{y_{k} - \lbrack{Ax}\rbrack_{k}}}} \leq}} \\ {{{\sum\limits_{k = 0}^{K - 1}\; \frac{\left( {{y_{k} - \lbrack{Ax}\rbrack_{k}}} \right)^{2}}{2{{y_{k} - \left\lbrack {Ax}^{(m)} \right\rbrack_{k}}}}} + {\frac{{y_{k} - \left\lbrack {Ax}^{(m)} \right\rbrack_{k}}}{2}.(90)}}} \end{matrix} & (89) \end{matrix}$

Next, we replace the term (y_(k)−[Ax]_(k))² above by a majorizing function developed by De Pierro as referenced above. More specifically, De Pierro showed that ([Ax]_(k))² is majorized by

$\begin{matrix} {{q\left( {x,x^{(m)}} \right)}\overset{\Delta}{=}{\sum\limits_{l = 1}^{L}\; {C_{kl}\left( {{N_{k}A_{kl}x_{l}} - {N_{k}A_{kl}x_{l}^{(m)}} + \left\lbrack {Ax}^{(m)} \right\rbrack_{k}} \right)}^{2}}} & (91) \end{matrix}$

where N_(k) is the number of non-zero elements in the k^(th) row of the system matrix A and C_(kl) is defined by

$\begin{matrix} {C_{kl}\overset{\Delta}{=}\left\{ {\begin{matrix} {N_{k}^{- 1},} & {A_{kl} \neq 0} \\ {0,} & {A_{kl} = 0} \end{matrix}.} \right.} & (92) \end{matrix}$

It then follows that

(y _(k) −[Ax] _(k))² =y _(k) ²−2y _(k) [Ax] _(k)+([Ax] _(k))²   (93)

≦y _(k) ²−2y _(k) [Ax] _(k) +q (x, x ^((m))) .   (94)

where q is given by (91). From (90) and (94) it can be seen that a majorizing function for φ₁ at the point x^((m)) is

$\begin{matrix} {{Q_{1}\left( {x,x^{(m)}} \right)}\overset{\Delta}{=}{{\sum\limits_{k = 0}^{K - 1}\; \frac{y_{k}^{2} - {2{y_{k}\lbrack{Ax}\rbrack}_{k}} + {q\left( {x,x^{(m)}} \right)}}{2{{y_{k} - \left\lbrack {Ax}^{(m)} \right\rbrack_{k}}}}} + {\sum\limits_{k = 0}^{K - 1}\; {\frac{1}{2}{{{y_{k} - \left\lbrack {Ax}^{(m)} \right\rbrack_{k}}}.}}}}} & (95) \end{matrix}$

Because the penalty factor λ is positive, a majorizing function for the objective function φ(x)=φ₁(x)+λφ₂(x) is

Q(x, x ^((m)))=Q ₁(x, x ^((m)))+λQ ₂(x, x ^((m)))   (96)

where Q₂ is given by (88). From the general expression in (17), we obtain the desired iterative algorithm by setting to zero the partial derivatives of Q (x, x^((m))) with respect to the components of x. For l=1, 2, . . . , L,

$\begin{matrix} {\frac{\partial Q}{\partial x_{l}} = \frac{\partial}{\partial x_{l}}\sum\limits_{k = 0}^{K - 1} \frac{y_{k}^{2} - 2y_{k}\lbrack Ax \rbrack_{k} + q\left( x, x^{(m)} \right)}{2\left| y_{k} - \left\lbrack Ax^{(m)} \right\rbrack_{k} \right|} + \frac{\partial}{\partial x_{l}}\sum\limits_{k = 0}^{K - 1} \frac{1}{2}\left| y_{k} - \left\lbrack Ax^{(m)} \right\rbrack_{k} \right| + \lambda \cdot \frac{\partial}{\partial x_{l}}\sum\limits_{s = 1}^{L} \left\lbrack \frac{x_{s}^{2}}{2\left| x_{s}^{(m)} \right|} + \frac{1}{2}\left| x_{s}^{(m)} \right| \right\rbrack.} & (97) \end{matrix}$

Computing the derivatives in (97), simplifying and re-arranging terms gives

$\begin{matrix} {\frac{\partial Q}{\partial x_{l}} = \left( \sum\limits_{k = 0}^{K - 1} \frac{N_{k}A_{kl}^{2}}{\left| y_{k} - \left\lbrack Ax^{(m)} \right\rbrack_{k} \right|} + \frac{\lambda}{\left| x_{l}^{(m)} \right|} \right) \cdot x_{l} - \left( \sum\limits_{k = 0}^{K - 1} \frac{N_{k}A_{kl}^{2}}{\left| y_{k} - \left\lbrack Ax^{(m)} \right\rbrack_{k} \right|} \right) \cdot x_{l}^{(m)} - \sum\limits_{k = 0}^{K - 1} \frac{A_{kl}\left( y_{k} - \left\lbrack Ax^{(m)} \right\rbrack_{k} \right)}{\left| y_{k} - \left\lbrack Ax^{(m)} \right\rbrack_{k} \right|}.} & (98) \end{matrix}$

Setting the partial derivatives to zero leads to the proposed l₁-LAD algorithm

$\begin{matrix} {x_{l}^{(m + 1)} = \frac{D_{l}^{(m)} \cdot x_{l}^{(m)} + N_{l}^{(m)}}{D_{l}^{(m)} + \frac{\lambda}{\left| x_{l}^{(m)} \right|}},} & (99) \end{matrix}$

for l=1, 2, . . . , L, and where the terms D_(l) ^((m)) and N_(l) ^((m)) are given by

$\begin{matrix} {D_{l}^{(m)} = \sum\limits_{k = 0}^{K - 1} \frac{N_{k}A_{kl}^{2}}{\left| y_{k} - \left\lbrack Ax^{(m)} \right\rbrack_{k} \right|}} & (100) \\ {N_{l}^{(m)} = \sum\limits_{k = 0}^{K - 1} A_{kl} \cdot {sign}\left( y_{k} - \left\lbrack Ax^{(m)} \right\rbrack_{k} \right).} & (101) \end{matrix}$
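The update in (99) through (101) can be sketched for a small dense system matrix as follows. This is a minimal illustration rather than the fast implementation discussed below: the names `sign`, `l1_lad_objective`, and `l1_lad_step` are hypothetical, plain Python lists are used to keep the sketch dependency-free, and the small constant `eps` that guards against division by zero is an added assumption, since the source does not say how zero residuals or zero coefficients are handled.

```python
def sign(v):
    """Return -1, 0 or +1 according to the sign of v."""
    return (v > 0) - (v < 0)


def l1_lad_objective(A, y, lam, x):
    """phi(x) = ||y - Ax||_1 + lam * ||x||_1, the objective in (86)."""
    residual = sum(abs(yk - sum(a * xl for a, xl in zip(row, x)))
                   for row, yk in zip(A, y))
    return residual + lam * sum(abs(xl) for xl in x)


def l1_lad_step(A, y, lam, x, eps=1e-12):
    """One MM update of every coefficient, following (99)-(101)."""
    n_rows, n_cols = len(A), len(x)
    # N_k in (100): number of nonzero entries in row k of A.
    row_nnz = [sum(1 for a in row if a != 0) for row in A]
    # Residuals y_k - [A x^(m)]_k at the current iterate.
    r = [yk - sum(a * xl for a, xl in zip(row, x)) for row, yk in zip(A, y)]
    # Assumption: clamp |r_k| and |x_l| away from zero to avoid dividing by zero.
    abs_r = [max(abs(rk), eps) for rk in r]
    x_next = []
    for l in range(n_cols):
        d = sum(row_nnz[k] * A[k][l] ** 2 / abs_r[k] for k in range(n_rows))
        n = sum(A[k][l] * sign(r[k]) for k in range(n_rows))
        x_next.append((d * x[l] + n) / (d + lam / max(abs(x[l]), eps)))
    return x_next


if __name__ == "__main__":
    # Toy system whose last observation is a large outlier.
    A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
    y = [2.0, 0.1, 2.0, 50.0]
    x = [0.5, 0.5]  # non-zero starting vector
    for _ in range(50):
        x = l1_lad_step(A, y, 0.1, x)
    print(x, l1_lad_objective(A, y, 0.1, x))
```

Each coefficient is updated from quantities that depend only on the previous iterate, which is why the per-voxel updates can be computed in parallel.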

Using this approach, a processing device can be configured to iteratively derive from an initial data set an estimated image value for a given voxel. In GPR imaging, the initial data set received by the processing device may include data representing transmission site locations of radar pulses, reception site locations of reception of reflections from the radar pulses, radar-return profiles for pairings of the transmission site locations and reception site locations, and data samples associated with individual radar-return profiles. For instance, the above algorithm can be initialized using the reflectance coefficient estimates obtained from the standard DAS algorithm. In the general setting, the algorithm can be initialized using an arbitrary non-zero vector.

As with the application of the MM principle to the l₁-LS algorithm, a processing device can be configured to use a fast implementation strategy for computing D_(l) ^((m)) and N_(l) ^((m)) with significant speed gain and minimal data/matrix storage. In one such approach, the mathematical expressions can be modified to reduce processing time and required memory by accounting for a symmetric nature of a given radar pulse, accounting for similar discrete time delays between transmission of a given radar pulse and reception of reflections from the given radar pulse, and accounting for a short duration of the given radar pulse. More specifically, a derivation of the fast implementation similar to that described above can be applied to the equations for D_(l) ^((m)) and N_(l) ^((m)) to allow computing by application of hash-table-based computations, including using slight modifications of the pseudocode described above.

Thus, the fast implementation may include calculating the terms by computing N_(l) ^((m)) by applying a hash-table based computation to

$\begin{matrix} {{q_{ij}\lbrack k\rbrack} = {\sum\limits_{l \in _{k}}\; d_{ijl}^{(m)}}} & (102) \end{matrix}$

More specifically, we can write N_(l) ^((m)) as

$\begin{matrix} {N_{l}^{(m)} = {\sum\limits_{i = 1}^{I}\; {\sum\limits_{j = 1}^{J}\; {\alpha_{ijl} \cdot N_{ijl}^{(m)}}}}} & (103) \end{matrix}$

where

N _(ijl) ^((m)) ={w*(sign(y _(ij) −s _(ij) ^((m))))[k]}| _(k=n) _(ijl) ,   (104)

s _(ij) ^((m)) [n]=(q _(ij) *w)[n]  (105)

and y_(ij)[k] is a k^(th) sample of a radar-return profile associated with an i^(th) transmit location and a j^(th) receiver, s_(ij) ^((m))[k] is an m^(th) estimate of a noise-free component of y_(ij)[k], w is a discretized version of the given radar pulse, α_(ijl) represents attenuation of the given radar pulse during travel from an i^(th) transmit location to an l^(th) voxel and back to a j^(th) receiver, and n_(ijl) is a discrete time-delay corresponding to rounding a quotient of time for the given radar pulse to travel from a transmitter at the i^(th) transmit location to the l^(th) voxel and back to the j^(th) receiver and a sampling interval, and * denotes discrete-time convolution.
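The hash-table step in (102) and the convolution in (105) can be sketched for a single transmit/receive pair (i, j) as follows. This is a minimal illustration under stated assumptions: the helper names `bin_by_delay` and `centered_convolve` are hypothetical, d_(ijl) ^((m)) is taken to be α_(ijl)x_(l) ^((m)), and the pulse w is assumed to be stored as 2M+1 samples for offsets −M through M so that s_(ij)[n] = Σ_k q_(ij)[k]·w[n−k].

```python
from collections import defaultdict


def bin_by_delay(delays, alphas, x):
    """q[k] = sum of alpha_l * x_l over the voxels l whose delay n_l equals k (102)."""
    q = defaultdict(float)
    for n_l, a_l, x_l in zip(delays, alphas, x):
        q[n_l] += a_l * x_l
    return q


def centered_convolve(sparse_q, w, n_samples):
    """s[n] = sum_k q[k] * w[n - k] (105), with w stored for offsets -M..M."""
    M = (len(w) - 1) // 2
    s = [0.0] * n_samples
    # Only delays that actually occur are visited, so the cost scales with the
    # number of distinct delays times the pulse length, not with the voxel count.
    for k, qk in sparse_q.items():
        for off in range(-M, M + 1):
            n = k + off
            if 0 <= n < n_samples:
                s[n] += qk * w[off + M]
    return s


if __name__ == "__main__":
    delays = [2, 2, 5]        # n_ijl for three voxels; two share a delay
    alphas = [1.0, 0.5, 2.0]  # attenuation alpha_ijl
    x = [1.0, 2.0, 3.0]       # current estimates x_l^(m)
    w = [0.5, 1.0, 0.5]       # short symmetric pulse, M = 1
    q = bin_by_delay(delays, alphas, x)
    print(dict(q))
    print(centered_convolve(q, w, 8))
```

Voxels that share a discrete delay are merged into one table entry before the convolution, which is where the savings from similar time delays and a short pulse come from.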

Similarly, the fast implementation may include calculating the terms by computing D_(l) ^((m)) by applying a hash-table based computation to |S_(k)| where |S_(k)| denotes a number of elements in the set S_(k). More specifically, we can write D_(l) ^((m)) as

$\begin{matrix} {D_{l}^{(m)} = \sum\limits_{i = 1}^{I} \sum\limits_{j = 1}^{J} \alpha_{ijl}^{2} \cdot D_{ijl}^{(m)}\mspace{14mu} {where}} & (106) \\ {D_{ijl}^{(m)} = \left. \left\{ \left( h * \left( \frac{\gamma_{ij}}{\left| y_{ij} - s_{ij}^{(m)} \right|} \right) \right)\lbrack k \rbrack \right\} \right|_{k = n_{ijl}},} & (107) \\ {\gamma_{ij}\lbrack n \rbrack = v\left( \min\left( n + M, N \right) \right) - v\left( \max\left( 0, n - M \right) \right),} & (108) \\ {v(m) = \sum\limits_{k = 0}^{m} \left| S_{k} \right|} & (109) \end{matrix}$

and where h is a discretized version of a squared radar pulse and 2M +1 is a number of non-zero samples in the given radar pulse.
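The quantities γ_(ij) and v can be obtained from a running sum over the delay counts |S_(k)|. The following is a minimal sketch for one transmit/receive pair; the function names `delay_counts` and `gamma` are hypothetical, v is read as the cumulative count of voxels with delay at most m, and N is taken to be the largest sample index. The sketch transcribes the expression for γ_(ij)[n] literally; whether the lower limit should be shifted by one sample depends on an indexing convention that is not stated here.

```python
from collections import Counter
from itertools import accumulate


def delay_counts(delays, n_max):
    """|S_k| for k = 0..n_max: number of voxels whose discrete delay equals k."""
    counts = Counter(delays)
    return [counts.get(k, 0) for k in range(n_max + 1)]


def gamma(delays, n_max, M):
    """gamma[n] = v(min(n + M, N)) - v(max(0, n - M)), with v the running sum of |S_k|."""
    v = list(accumulate(delay_counts(delays, n_max)))
    return [v[min(n + M, n_max)] - v[max(0, n - M)] for n in range(n_max + 1)]


if __name__ == "__main__":
    delays = [2, 2, 5]  # n_ijl for three voxels
    print(delay_counts(delays, 7))
    print(gamma(delays, 7, 1))
```

Because v is computed once per transmit/receive pair, each γ_(ij)[n] costs two table lookups regardless of how many voxels contribute.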

By one approach, the method described above can be implemented as illustrated in FIG. 11, where a processing device receives 1105 an initial data set and processes 1110 the initial data set by creating an estimated image value for each voxel in the image by iteratively deriving the estimated image value through application of a majorize-minimize principle to solve an l₁-regularized least-absolute deviation (LAD) estimation problem associated with a mathematical model of image data from the initial data set. These basic steps can be applied to achieve fast and computationally efficient preparation of an image, with more robust handling of data outliers, using the estimated image value of individual voxels of the image, which can be displayed 1115. The initial data set may be sourced from a variety of applications in which data sets are collected using synthetic aperture imaging measurement principles. In the GPR context, the method may further include emitting 1120 a radar pulse at specified intervals into a scene-of-interest and detecting 1125 magnitude of signal reflections from the scene of interest from the radar pulse. Position data is recorded 1130 corresponding to individual radar pulse emissions and individual receptions of the signal reflections. The initial data set in this application is created 1135 from the position data and detected magnitudes of the signal reflections. Where the method is carried out remote from the vehicle, it is sufficient that the receipt of the initial data set to be processed includes receiving data representing transmission site locations of radar pulses, reception site locations of reception of reflections from the radar pulses, radar-return profiles for pairings of the transmission site locations and the reception site locations, and data samples associated with individual radar-return profiles.

So configured, an l₁-regularized least absolute deviation algorithm with application of the MM principle is easy to implement and robust to outliers. Although discussed herein largely with respect to GPR imaging, the proposed l₁-LAD algorithm is generally applicable to any data that fits a linear model where most of the unknown parameter values are zero. Preliminary results indicate that the described l₁-LAD algorithm adequately estimates the reflectance coefficients to allow for display of objects in the SOI and is noticeably more robust to outliers/spikes than other l₁-regularization algorithms. Although the proposed algorithm is more computationally expensive than some existing algorithms such as the standard DAS and LASSO algorithms, substantial gains in computational speed and memory usage are attainable via application of the fast implementation techniques. Furthermore, because the estimation of reflectance coefficients is decoupled, parallel and/or distributed processing implementations can also be applied to increase computational speed.

Results for the MM-Based l₁-LAD Algorithm

The proposed MM-based l₁-LAD algorithm was tested using a numerical experiment and simulated GPR data that closely mimics the measurements generated by the UWB SIRE system. The numerical experiment is used to illustrate, in the general setting, the robustness to outliers in the data that is processed by the various algorithms. To perform this test, a length-N data vector y is generated using the standard additive white Gaussian noise model y=Ax+w, where x is a length-L vector of regression coefficients and w is an AWGN vector with variance σ². The numerical values for the above parameters are N=500, L=23 and σ²=1. The 500×23 system matrix A is randomly generated. FIGS. 12 and 13 show respectively the estimation results of the known DAS and LASSO algorithms and the MM-based l₁-LAD algorithm when a single erroneous outlier/spike is inserted into the observation data y.
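A data set of the kind used in this numerical experiment can be generated along the following lines. This is a minimal sketch under several assumptions that are not taken from the source: the function name `make_experiment`, the standard normal entries of A, the number and range of the nonzero coefficients, the position and size of the spike, and the random seed are all illustrative choices. The matrix is given 23 columns to match the stated 500×23 size.

```python
import random


def make_experiment(n_obs=500, n_coef=23, n_nonzero=5, sigma=1.0, spike=None, seed=0):
    """Generate (A, x, y) with y = A x + w and an optional single outlier in y."""
    rng = random.Random(seed)
    # Randomly generated system matrix (assumption: standard normal entries).
    A = [[rng.gauss(0.0, 1.0) for _ in range(n_coef)] for _ in range(n_obs)]
    # Sparse coefficient vector (assumption: n_nonzero entries drawn from [1, 5]).
    x = [0.0] * n_coef
    for l in rng.sample(range(n_coef), n_nonzero):
        x[l] = rng.uniform(1.0, 5.0)
    # Additive white Gaussian noise with standard deviation sigma.
    y = [sum(a * xl for a, xl in zip(row, x)) + rng.gauss(0.0, sigma) for row in A]
    if spike is not None:
        index, value = spike
        y[index] += value  # insert one erroneous outlier into the observations
    return A, x, y


if __name__ == "__main__":
    A, x, y = make_experiment(spike=(10, 1000.0))
    print(len(A), len(A[0]), len(y))
```

Passing the same seed with and without `spike` yields two data sets that differ only in the spiked observation, which makes it easy to isolate the effect of the outlier on an estimator.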

FIG. 12 illustrates that the estimation performance of the DAS and LASSO approaches of the l₁-LS algorithms can be degraded by the presence of a single significant outlier. These numerical experiments indicate that the level of inaccuracy in the l₁-LS estimate is commensurate with the number and magnitude of outliers.

In contrast, FIG. 13 shows that the proposed l₁-LAD approach is immune to the presence of the outlier as it properly estimates the regression coefficients. FIG. 14 illustrates that without outliers, the l₁-LAD and the DAS and RSM versions of the l₁-LS algorithms give adequate and comparable results. In the outlier example, the iterative procedure of the described l₁-LAD algorithm was initialized with arbitrary/random values. The coefficients are estimated using 5000 iterations, although analysis of the algorithm's cost function of FIG. 15 demonstrates that fewer iterations would have sufficed. FIG. 15 also illustrates that the cost function is monotonically decreasing.

FIGS. 16 and 17 illustrate respectively images resulting from application of the DAS algorithm and the MM-based l₁-LAD algorithm to the ARL simulated GPR data. FIG. 17 demonstrates removal of noise using the l₁-LAD algorithm as compared to FIG. 16's image created using the DAS algorithm. Moreover, the undesirable shadow-effects in the vicinity of scatterers, which are seen in the DAS image, are absent in the l₁-LAD image. Additionally, the described l₁-LAD algorithm retains the above results when erroneously large values are randomly added to the ARL dataset to mimic potential outliers and qualitatively test the robustness of the algorithm.

Another Approach: Application of MM-Principle to l₁-Regularization of a DAS Derived Data Set

A third approach to application of the MM principle to processing an image data set includes application to the l₁-regularized least squares problem within an image data set created through use of the DAS algorithm. More specifically, and with reference to FIG. 18, one such method includes receiving 1805 a DAS image data set created by applying a delay-and-sum (DAS) algorithm to an initial data set. The DAS image data set is processed 1810 with a processing device by creating an estimated image value for each voxel in the image by iteratively deriving the estimated image value through application of a majorize-minimize principle to solve an l₁-regularized least-squares estimation problem that selects a sparse image derived from the DAS image data set. This approach can be considered an l₁-Sparsity Improvement Restoration (l₁-SIR) algorithm.

These basic steps can be applied to achieve fast and computationally efficient image preparation to obtain the estimated image value of individual voxels of the image for display 1815. In the GPR context, the method may further include, when carried out on or local to the vehicle, emitting 1820 a radar pulse at specified intervals into a scene-of-interest and detecting 1825 magnitude of signal reflections from the scene of interest from the radar pulse. Position data is recorded 1830 corresponding to individual radar pulse emissions and individual receptions of the signal reflections. The initial data set in this application is created 1835 from the position data and detected magnitudes of the signal reflections. The DAS algorithm is applied 1840 to the initial data set to create the DAS image data set. Where the method is carried out remote from the vehicle, it is sufficient that the receipt of the image data set to be processed includes receiving data representing transmission site locations of radar pulses, reception site locations of reception of reflections from the radar pulses, radar-return profiles for pairings of the transmission site locations and the reception site locations, and data samples associated with individual radar-return profiles.

More specifically, let x_(DAS) be the DAS image for some SOI. We propose to reconstruct an improved image by minimizing the following l₁-regularized LS objective function

$\begin{matrix} {\hat{x} = \underset{x}{\arg\min}\; \left\| x_{DAS} - x \right\|_{2}^{2} + \lambda\left\| x \right\|_{1}} & (110) \end{matrix}$

where λ>0 is the penalty or regularization parameter.

The optimization problem in (110) in this approach is solved using the MM principle as described above. The resulting algorithm is straightforward-to-implement, computationally efficient and amenable to parallel (or distributed) implementations.

A quadratic majorizing function for the absolute value function ƒ(x)=|x| is given by De Leeuw and Lange as referenced above. From their result, it follows that for any real L×1 vector x^((m)) without any zero elements a majorizing function for the function ∥x∥₁ at the point x^((m)) is

$\begin{matrix} {{q\left( {x,x^{(m)}} \right)}\overset{\Delta}{=}{\sum\limits_{l = 1}^{L}\; {\left( {\frac{x_{l}^{2}}{2{x_{l}^{(m)}}} + {\frac{1}{2}{x_{l}^{(m)}}}} \right).}}} & (111) \end{matrix}$

In turn, it follows that a majorizing function for the objective function in (110) at the point x^((m)) is

Q(x, x ^((m)))=∥x _(DAS) −x∥ ₂ ² +λq(x, x ^((m))).   (112)

Noting the general expression in (17), the remaining steps are to compute the partial derivatives of the majorizing function Q and set the results to zero. For l=1, 2, . . . , L, the partial derivative of Q with respect to x_(l) is equal to

$\begin{matrix} {\frac{\partial Q}{\partial x_{l}} = \frac{\partial}{\partial x_{l}}\sum\limits_{s = 1}^{L} \left( x_{{DAS},s} - x_{s} \right)^{2} + \lambda \cdot \frac{\partial}{\partial x_{l}}\sum\limits_{s = 1}^{L} \left( \frac{x_{s}^{2}}{2\left| x_{s}^{(m)} \right|} + \frac{1}{2}\left| x_{s}^{(m)} \right| \right)} & (113) \\ {= - 2\left( x_{{DAS},l} - x_{l} \right) + \lambda\frac{x_{l}}{\left| x_{l}^{(m)} \right|}.} & (114) \end{matrix}$

Setting the partial derivatives to zero leads to the proposed l₁-SIR algorithm

$\begin{matrix} {x_{l}^{(m + 1)} = \frac{x_{{DAS},l}}{1 + \frac{\lambda}{2\left| x_{l}^{(m)} \right|}}.} & (115) \end{matrix}$

As in the other approaches, one may iteratively derive an estimated image value for a given voxel using the immediately above equation for x_(l) ^((m+1)). Accordingly, a processing device may be configured to process a DAS image data set by creating an estimated image value for each voxel in the image by iteratively deriving the estimated image value through application of a majorize-minimize principle to solve an l₁-regularized least-squares estimation problem that selects a sparse image derived from the DAS image data set.
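The l₁-SIR update can be sketched in a few lines. This is a minimal illustration, not the implementation used to produce the reported results: the function name `l1_sir`, the choice to start the iteration at the DAS image itself, and the small constant `eps` that guards against division by zero are assumptions. Because the objective is separable, its minimizer is known in closed form (each x_(DAS,l) is shrunk toward zero by λ/2, and set to zero when its magnitude is below λ/2), which gives a convenient check on the iteration.

```python
def l1_sir(x_das, lam, n_iter=200, eps=1e-12):
    """Iterate x_l <- x_das_l / (1 + lam / (2 |x_l|)) independently for every voxel."""
    x = list(x_das)  # assumption: initialize at the DAS image
    for _ in range(n_iter):
        # Assumption: clamp |x_l| at eps so that a vanishing voxel does not divide by zero.
        x = [xd / (1.0 + lam / (2.0 * max(abs(xl), eps)))
             for xd, xl in zip(x_das, x)]
    return x


def soft_threshold(x_das, lam):
    """Closed-form minimizer of (x_das - x)^2 + lam * |x| for each component."""
    return [(1 if v > 0 else -1) * max(abs(v) - lam / 2.0, 0.0) for v in x_das]


if __name__ == "__main__":
    x_das = [3.0, -2.0, 0.2]
    print(l1_sir(x_das, 1.0))
    print(soft_threshold(x_das, 1.0))
```

With `x_das = [3.0, -2.0, 0.2]` and `lam = 1.0`, the iterates approach 2.5, −1.5 and 0: large values are shrunk by λ/2 and small values are driven toward zero, which is how the algorithm increases sparsity in the DAS image.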

Results for the MM-Based l₁-LS Algorithm Applied to DAS Image Data Set

The performance of the MM-based l₁-LS algorithm as applied to a DAS image data set can be evaluated using a real dataset obtained using a UWB SIRE apparatus and provided by the US Army Research Laboratory. FIGS. 19-21 show images derived using the DAS, l₁-SIR, and the MM-based l₁-LS algorithms. Note that the l₁-LS image was reconstructed using the algorithm described in the first approach above. From a comparison of the figures, the images created using the l₁-SIR and l₁-LS algorithms are comparable in terms of the level of sparsity and ability to resolve known targets in the SOI. However, using the MATLAB software package, the l₁-SIR image takes about 1.5 minutes to create while the l₁-LS image requires approximately 2.5 hours to reconstruct.

So configured, the l₁-SIR algorithm is computationally efficient and only takes approximately 5% of the time required by the DAS algorithm. In studies using real data, the l₁-SIR images are an improvement over the DAS images in that they have reduced clutter and improved sparsity without a loss of known scatterers. Additionally, the l₁-SIR images in the studies were comparable to the l₁-LS images. However, the l₁-SIR algorithm only takes 1% of the computational time of an l₁-LS algorithm that is also based on the MM principle. Moreover, it is contemplated that the MM principle as applied to an l₁-least squares estimate can be extended for application to image data sets created by other known algorithms to reduce noise in resulting images.

Those skilled in the art will recognize that a wide variety of modifications, alterations, and combinations can be made with respect to the above described embodiments without departing from the scope of the invention, and that such modifications, alterations, and combinations are to be viewed as being within the ambit of the inventive concept. 

1. A method of creating an image from an initial data set, the method comprising: receiving the initial data set; processing the initial data set with a processing device by creating an estimated image value for each voxel in the image by iteratively deriving the estimated image value through application of a majorize-minimize principle to solve an l₁-regularized least-squares estimation problem associated with a mathematical model of image data from the initial data set; displaying the image using the estimated image value of individual voxels of the image.
 2. The method of claim 1 wherein the receiving the initial data set comprises receiving data representing transmission site locations of radar pulses, reception site locations of reception of reflections from the radar pulses, radar-return profiles for pairings of the transmission site locations and the reception site locations, and data samples associated with individual radar-return profiles.
 3. The method of claim 1 further comprising iteratively deriving an estimated image value for a given voxel using, for l=1, 2, . . . , L, ${x_{l}^{(m + 1)} = \frac{H_{l}x_{l}^{(m)} + G_{l}^{(m)}}{H_{l} + \frac{\lambda}{2\left| x_{l}^{(m)} \right|}}}\mspace{14mu}$ where $H_{l} = \sum\limits_{k = 0}^{K - 1} r_{k}A_{kl}^{2}$ and $G_{l}^{(m)} = \sum\limits_{k = 0}^{K - 1} A_{kl}\left( y_{k} - \left\lbrack Ax^{(m)} \right\rbrack_{k} \right)$ and where x_(l) ^((m+1)) is the estimated image value at an l^(th) voxel for iterate number (m+1), x_(l) ^((m)) is the estimated image value at an l^(th) voxel for iterate number (m), A is a K×L system matrix associated with the mathematical model of the image data, K=IJL where I is a number of transmission site locations, J is a number of reception sites for each transmit site location, L is a number of pixels in the image, r_(k) is a number of nonzero elements in a k^(th) row of the matrix A, and y_(k) is a k^(th) sample of the initial data set.
 4. The method of claim 2 further comprising calculating terms used to obtain the estimated image value, wherein the calculating reduces processing time and required memory by accounting for a symmetric nature of a given radar pulse, accounting for similar discrete time delays between transmission of a given radar pulse and reception of reflections from the given radar pulse, and accounting for a short duration of the given radar pulse.
 5. The method of claim 4 wherein the calculating the terms comprises: computing G_(l) ^((m)) by applying a hash-table-based computation to ${q_{ij}\lbrack k \rbrack = \sum\limits_{l \in S_{k}} d_{ijl}^{(m)}}$ where d_(ijl) ^((m))=α_(ijl)x_(l) ^((m)) and S_(k)={l=1, 2, . . . , L: n_(ijl)=k} and α_(ijl) represents attenuation of the given radar pulse during travel from an i^(th) transmit location to an l^(th) voxel and back to a j^(th) receiver.
 6. The method of claim 5 further comprising computing G_(l) ^((m)) via, for i=1, 2, . . . , I; j=1, 2, . . . , J; l=1, 2, . . . , L, ${G_{l}^{(m)} = \sum\limits_{i = 1}^{I} \sum\limits_{j = 1}^{J} \alpha_{ijl}G_{ijl}^{(m)}}\mspace{14mu}$ where G_(ijl) ^((m))={(w*(y_(ij)−s_(ij) ^((m))))[k]}|_(k=n_(ijl)) and s_(ij) ^((m))[n]=(q_(ij)*w)[n], and where y_(ij)[k] is a k^(th) sample of a radar-return profile associated with an i^(th) transmit location and j^(th) receiver, s_(ij) ^((m))[k] is the m^(th) estimate of a noise-free component of y_(ij)[k], w is a discretized version of the given radar pulse, and n_(ijl) is a discrete time-delay corresponding to rounding the quotient of the time taken by the given radar pulse to travel from a transmitter at the i^(th) transmit location to the l^(th) voxel and back to the j^(th) receiver and a sampling interval, and * denotes discrete-time convolution.
 7. The method of claim 4 wherein the calculating of the terms comprises: computing H_(l) by applying a hash-table-based computation to |S_(k)| where |S_(k)| denotes a number of elements in the set S_(k).
 8. The method of claim 7 wherein the computing H_(l) further comprises computing H_(l) via, for i=1, 2, . . . , I; j=1, 2, . . . , J; l=1, 2, . . . , L, ${H_{l} = \sum\limits_{i = 1}^{I} \sum\limits_{j = 1}^{J} \alpha_{ijl}^{2}H_{ijl}}\mspace{14mu}$ where ${H_{ijl} = \left. \left( \gamma_{ij} * h \right)\lbrack k \rbrack \right|_{k = n_{ijl}}},{\gamma_{ij}\lbrack n \rbrack = v\left( \min\left( n + M, N \right) \right) - v\left( \max\left( 0, n - M \right) \right)},{v(m) = \sum\limits_{k = 0}^{m} \left| S_{k} \right|}$ and where h is a discretized version of a squared radar pulse and 2M+1 is a number of non-zero samples in the given radar pulse.
 9. The method of claim 1 further comprising: emitting a radar pulse at specified intervals into a scene of interest; detecting magnitude of signal reflections from the scene of interest from the radar pulse; recording position data corresponding to individual radar pulse emissions and individual receptions of the signal reflections; and creating the initial data set from the position data and detected magnitudes of the signal reflections.
 10. A method of creating an image from an initial data set, the method comprising: receiving the initial data set; processing the initial data set with a processing device by: creating an estimated image value for each voxel in the image by iteratively deriving the estimated image value through application of a majorize-minimize principle to solve the l₁-regularized least-absolute deviation estimation problem associated with a mathematical model of image data from the initial data set; displaying the image using the estimated image value of individual voxels of the image.
 11. The method of claim 10 wherein the receiving the initial data set comprises receiving data representing transmission site locations of radar pulses, reception site locations of reception of reflections from the radar pulses, radar-return profiles for pairings of the transmission site locations and reception site locations, and data samples associated with individual radar-return profiles.
 12. The method of claim 10 further comprising iteratively deriving an estimated image value for a given voxel using, for l=1, 2, . . . , L, ${x_{l}^{(m + 1)} = \frac{D_{l}^{(m)}x_{l}^{(m)} + N_{l}^{(m)}}{D_{l}^{(m)} + \frac{\lambda}{\left| x_{l}^{(m)} \right|}}}\mspace{14mu}$ where $D_{l}^{(m)} = \sum\limits_{k = 0}^{K - 1} \frac{r_{k}A_{kl}^{2}}{\left| y_{k} - \left\lbrack Ax^{(m)} \right\rbrack_{k} \right|}$ and $N_{l}^{(m)} = \sum\limits_{k = 0}^{K - 1} A_{kl} \cdot {sign}\left( y_{k} - \left\lbrack Ax^{(m)} \right\rbrack_{k} \right)$ and where x_(l) ^((m+1)) is the estimated image value at an l^(th) voxel for iterate number (m+1), x_(l) ^((m)) is the estimated image value at an l^(th) voxel for iterate number (m), A is a K×L system matrix associated with the mathematical model of the image data, r_(k) is a number of nonzero elements in a k^(th) row of the matrix A, and y_(k) is a k^(th) sample of the initial data set.
 13. The method of claim 11 further comprising calculating terms used to obtain the estimated image value, wherein the calculating the terms reduces processing time and required memory by accounting for a symmetric nature of a given radar pulse, accounting for similar discrete time delays between transmission of a given radar pulse and reception of reflections from the given radar pulse, and accounting for a short duration of the given radar pulse.
 14. The method of claim 13 wherein the calculating the terms comprises: computing N_(l) ^((m)) by applying a hash-table-based computation to ${q_{ij}\lbrack k \rbrack = \sum\limits_{l \in S_{k}} d_{ijl}^{(m)}}$.
 15. The method of claim 14 wherein the computing the term N_(l) ^((m)) further comprises computing N_(l) ^((m)) via, for i=1, 2, . . . , I; j=1, 2, . . . , J; l=1, 2, . . . , L, ${N_{l}^{(m)} = \sum\limits_{i = 1}^{I} \sum\limits_{j = 1}^{J} \alpha_{ijl}N_{ijl}^{(m)}}\mspace{14mu}$ where N_(ijl) ^((m))={w*(sign(y_(ij)−s_(ij) ^((m))))[k]}|_(k=n_(ijl)), s_(ij) ^((m))[n]=(q_(ij)*w)[n], and y_(ij)[k] is a k^(th) sample of a radar-return profile associated with an i^(th) transmit location and a j^(th) receiver, s_(ij) ^((m))[k] is an m^(th) estimate of a noise-free component of y_(ij)[k], w is a discretized version of the given radar pulse, α_(ijl) represents attenuation of the given radar pulse during travel from an i^(th) transmit location to an l^(th) voxel and back to a j^(th) receiver, and n_(ijl) is a discrete time-delay corresponding to rounding a quotient of time for the given radar pulse to travel from a transmitter at the i^(th) transmit location to the l^(th) voxel and back to the j^(th) receiver and a sampling interval, and * denotes discrete-time convolution.
 16. The method of claim 13 wherein the calculating the terms comprises: computing D_(l) ^((m)) by applying a hash-table-based computation to |S_(k)| where |S_(k)| denotes a number of elements in the set S_(k).
 17. The method of claim 16 wherein the computing the term $D_{l}^{(m)}$ further comprises computing $D_{l}^{(m)}$ via, for $i = 1, 2, \ldots, I$; $j = 1, 2, \ldots, J$; $l = 1, 2, \ldots, L$, $D_{l}^{(m)} = \sum_{i=1}^{I} \sum_{j=1}^{J} \alpha_{ijl}^{2} D_{ijl}^{(m)}$ where $D_{ijl}^{(m)} = \left\{ \left( \left( \frac{\gamma_{ij}}{|y_{ij} - s_{ij}^{(m)}|} \right) * h \right)[k] \right\}_{k = n_{ijl}}$, $\gamma_{ij}[n] = v(\min(n + M, N)) - v(\max(0, n - M))$, $v(m) = \sum_{k=0}^{m} |S_{k}|$, and where $h$ is a discretized version of a squared radar pulse and $2M + 1$ is a number of non-zero samples in the given radar pulse.
 18. The method of claim 11 further comprising: emitting a radar pulse at specified intervals into a scene of interest; detecting magnitude of signal reflections from the scene of interest from the radar pulse; recording position data corresponding to individual radar pulse emissions and individual receptions of the signal reflections; and creating the initial data set from the position data and detected magnitudes of the signal reflections.
 19. An apparatus for detecting objects in a scene of interest, the apparatus comprising: a vehicle; a plurality of radar transmission devices mounted on the vehicle and configured to transmit radar pulses into a scene of interest; a plurality of radar reception devices mounted on the vehicle configured to detect magnitude of signal reflections from the scene of interest from the radar pulses; a location determination device configured to detect location of the vehicle at times of transmission of the radar pulse from the plurality of radar transmission devices and reception of the signal reflections by the radar reception devices; a processing device configured to process an initial data set representing transmission site locations of individual ones of the radar pulses, reception site locations of reception of individual ones of the signal reflections, and number of data samples per reception profile by: creating an estimated image value for each voxel in the image by iteratively deriving the estimated image value through application of a majorize-minimize principle to solve an l₁-regularized least-squares estimation problem associated with a mathematical model of image data from the initial data set.
 20. The apparatus of claim 19 wherein the processing device is further configured to iteratively derive an estimated image value for a given voxel using, for $l = 1, 2, \ldots, L$, $x_{l}^{(m+1)} = \frac{H_{l} x_{l}^{(m)} + G_{l}^{(m)}}{H_{l} + \frac{\lambda}{2|x_{l}^{(m)}|}}$ where $H_{l} = \sum_{k=0}^{K-1} r_{k} A_{kl}^{2}$ and $G_{l}^{(m)} = \sum_{k=0}^{K-1} A_{kl} (y_{k} - [Ax^{(m)}]_{k})$, and where $x_{l}^{(m+1)}$ is the estimated image value at an $l^{th}$ voxel for iterate number $(m+1)$, $x_{l}^{(m)}$ is the estimated image value at the $l^{th}$ voxel for iterate number $(m)$, $A$ is a $K \times L$ system matrix associated with the mathematical model of the image data, $K = IJL$ where $I$ is a number of transmission site locations, $J$ is a number of reception sites, and $L$ is a number of pixels in the image, $r_{k}$ is a number of nonzero elements in a $k^{th}$ row of the matrix $A$, and $y_{k}$ is a $k^{th}$ sample of the initial data set.
 21. The apparatus of claim 19 wherein the processing device is further configured to calculate terms used to obtain the estimated image value, wherein the calculating reduces processing time and required memory by accounting for a symmetric nature of a given radar pulse, accounting for similar discrete time delays between transmission of a given radar pulse and reception of reflections from the given radar pulse, and accounting for a short duration of the given radar pulse.
 22. The apparatus of claim 21 wherein the processing device is further configured to calculate terms by computing $G_{l}^{(m)}$ by applying a hash-table-based computation to $q_{ij}[k] = \sum_{l \in S_{k}} d_{ijl}^{(m)}$ where $d_{ijl}^{(m)} = \alpha_{ijl} x_{l}^{(m)}$ and $S_{k} = \{l = 1, 2, \ldots, L : n_{ijl} = k\}$ and $\alpha_{ijl}$ represents attenuation of the given radar pulse during travel from an $i^{th}$ transmit location to an $l^{th}$ voxel and back to a $j^{th}$ receiver.
 23. The apparatus of claim 22 wherein the processing device is further configured to compute $G_{l}^{(m)}$ via, for $i = 1, 2, \ldots, I$; $j = 1, 2, \ldots, J$; $l = 1, 2, \ldots, L$, $G_{l}^{(m)} = \sum_{i=1}^{I} \sum_{j=1}^{J} \alpha_{ijl} G_{ijl}^{(m)}$ where $G_{ijl}^{(m)} = \{(w * (y_{ij} - s_{ij}^{(m)}))[k]\}_{k = n_{ijl}}$, $s_{ij}^{(m)}[n] = (q_{ij} * w)[n]$, and where $y_{ij}[k]$ is a $k^{th}$ sample of a radar-return profile associated with an $i^{th}$ transmit location and a $j^{th}$ receiver, $s_{ij}^{(m)}[k]$ is the $m^{th}$ estimate of a noise-free component of $y_{ij}[k]$, $w$ is a discretized version of the given radar pulse, $n_{ijl}$ is a discrete time-delay corresponding to rounding a quotient of time for the given radar pulse to travel from a transmitter at the $i^{th}$ transmit location to the $l^{th}$ voxel and back to the $j^{th}$ receiver and a sampling interval, and $*$ denotes discrete-time convolution.
 24. The apparatus of claim 21 wherein the processing device is further configured to calculate terms by computing H_(l) by applying a hash-table-based computation to |S_(k)| where |S_(k)| denotes a number of elements in the set S_(k).
 25. The apparatus of claim 24 wherein the processing device is further configured to compute $H_{l}$ via, for $i = 1, 2, \ldots, I$; $j = 1, 2, \ldots, J$; $l = 1, 2, \ldots, L$, $H_{l} = \sum_{i=1}^{I} \sum_{j=1}^{J} \alpha_{ijl}^{2} H_{ijl}$ where $H_{ijl} = \{(\gamma_{ij} * h)[k]\}_{k = n_{ijl}}$, $\gamma_{ij}[n] = v(\min(n + M, N)) - v(\max(0, n - M))$, $v(m) = \sum_{k=0}^{m} |S_{k}|$, and where $h$ is a discretized version of a squared radar pulse and $2M + 1$ is a number of non-zero samples in the given radar pulse.
 26. An apparatus for detecting objects in a scene of interest, the apparatus comprising: a vehicle; a plurality of radar transmission devices mounted on the vehicle and configured to transmit radar pulses into a scene of interest; a plurality of radar reception devices mounted on the vehicle configured to detect magnitude of signal reflections from the scene of interest from the radar pulses; a location determination device configured to detect location of the vehicle at times of transmission of the radar pulse from the plurality of radar transmission devices and reception of the signal reflections by the radar reception devices; a processing device configured to process an initial data set representing transmission site locations of individual ones of the radar pulses, reception site locations of reception of individual ones of the signal reflections, and number of data samples per reception profile by: creating an estimated image value for each voxel in the image by iteratively deriving the estimated image value through application of a majorize-minimize principle to solve the l₁-regularized least-absolute deviation estimation problem associated with a mathematical model of image data from the initial data set.
 27. The apparatus of claim 26 wherein the processing device is further configured to iteratively derive an estimated image value for a given voxel using, for $l = 1, 2, \ldots, L$, $x_{l}^{(m+1)} = \frac{D_{l}^{(m)} x_{l}^{(m)} + N_{l}^{(m)}}{D_{l}^{(m)} + \frac{\lambda}{|x_{l}^{(m)}|}}$ where $D_{l}^{(m)} = \sum_{k=0}^{K-1} \frac{r_{k} A_{kl}^{2}}{|y_{k} - [Ax^{(m)}]_{k}|}$ and $N_{l}^{(m)} = \sum_{k=0}^{K-1} A_{kl} \cdot \operatorname{sign}(y_{k} - [Ax^{(m)}]_{k})$, and where $x_{l}^{(m+1)}$ is the estimated image value at an $l^{th}$ voxel for iterate number $(m+1)$, $x_{l}^{(m)}$ is the estimated image value at an $l^{th}$ voxel for iterate number $(m)$, $A$ is a $K \times L$ system matrix associated with the mathematical model of the image data, $r_{k}$ is a number of nonzero elements in a $k^{th}$ row of the matrix $A$, and $y_{k}$ is a $k^{th}$ sample of the initial data set.
 28. The apparatus of claim 26 wherein the processing device is further configured to calculate terms used to obtain the estimated image value, wherein the calculating the terms reduces processing time and required memory by accounting for a symmetric nature of a given radar pulse, accounting for similar discrete time delays between transmission of a given radar pulse and reception of reflections from the given radar pulse, and accounting for a short duration of the given radar pulse.
 29. The apparatus of claim 28 wherein the processing device is further configured to calculate terms by computing $N_{l}^{(m)}$ by applying a hash-table-based computation to $q_{ij}[k] = \sum_{l \in S_{k}} d_{ijl}^{(m)}$.
 30. The apparatus of claim 29 wherein the processing device is further configured to compute $N_{l}^{(m)}$ via, for $i = 1, 2, \ldots, I$; $j = 1, 2, \ldots, J$; $l = 1, 2, \ldots, L$, $N_{l}^{(m)} = \sum_{i=1}^{I} \sum_{j=1}^{J} \alpha_{ijl} N_{ijl}^{(m)}$ where $N_{ijl}^{(m)} = \{(w * \operatorname{sign}(y_{ij} - s_{ij}^{(m)}))[k]\}_{k = n_{ijl}}$, $s_{ij}^{(m)}[n] = (q_{ij} * w)[n]$, and $y_{ij}[k]$ is a $k^{th}$ sample of a radar-return profile associated with an $i^{th}$ transmit location and a $j^{th}$ receiver, $s_{ij}^{(m)}[k]$ is an $m^{th}$ estimate of a noise-free component of $y_{ij}[k]$, $w$ is a discretized version of the given radar pulse, $\alpha_{ijl}$ represents attenuation of the given radar pulse during travel from an $i^{th}$ transmit location to an $l^{th}$ voxel and back to a $j^{th}$ receiver, $n_{ijl}$ is a discrete time-delay corresponding to rounding a quotient of time for the given radar pulse to travel from a transmitter at the $i^{th}$ transmit location to the $l^{th}$ voxel and back to the $j^{th}$ receiver and a sampling interval, and $*$ denotes discrete-time convolution.
 31. The apparatus of claim 28 wherein the processing device is further configured to calculate terms by computing D_(l) ^((m)) by applying a hash-table-based computation to |S_(k)| where |S_(k)| denotes a number of elements in the set S_(k).
 32. The apparatus of claim 31 wherein the processing device is further configured to calculate terms by computing $D_{l}^{(m)}$ via, for $i = 1, 2, \ldots, I$; $j = 1, 2, \ldots, J$; $l = 1, 2, \ldots, L$, $D_{l}^{(m)} = \sum_{i=1}^{I} \sum_{j=1}^{J} \alpha_{ijl}^{2} D_{ijl}^{(m)}$ where $D_{ijl}^{(m)} = \left\{ \left( \left( \frac{\gamma_{ij}}{|y_{ij} - s_{ij}^{(m)}|} \right) * h \right)[k] \right\}_{k = n_{ijl}}$, $\gamma_{ij}[n] = v(\min(n + M, N)) - v(\max(0, n - M))$, $v(m) = \sum_{k=0}^{m} |S_{k}|$, and where $h$ is a discretized version of a squared radar pulse and $2M + 1$ is a number of non-zero samples in the given radar pulse.
 33. A method of creating an image from a data set, the method comprising: receiving a DAS image data set created by applying a delay-and-sum (DAS) algorithm to create an initial data set; processing the DAS image data set with a processing device by creating an estimated image value for each voxel in the image by iteratively deriving the estimated image value through application of a majorize-minimize principle to solve an l₁-regularized least-squares estimation problem that selects a sparse image derived from the DAS image data set; displaying the image using the estimated image value of individual voxels of the image.
 34. The method of claim 33 wherein the receiving the DAS image data set comprises receiving data representing transmission site locations of radar pulses, reception site locations of reception of reflections from the radar pulses, radar-return profiles for pairings of the transmission site locations and the reception site locations, and data samples associated with individual radar-return profiles.
 35. The method of claim 33 further comprising iteratively deriving an estimated image value for a given voxel using, for $l = 1, 2, \ldots, L$, $x_{l}^{(m+1)} = \frac{x_{DAS,l}}{1 + \frac{\lambda}{2|x_{l}^{(m)}|}}$ where $x_{l}^{(m+1)}$ is the estimated image value at an $l^{th}$ voxel for iterate number $(m+1)$, $x_{l}^{(m)}$ is the estimated image value at an $l^{th}$ voxel for iterate number $(m)$, $\lambda$ is a penalty parameter, and $x_{DAS,l}$ is a value of the $l^{th}$ voxel of the DAS image data set.
 36. The method of claim 33 further comprising: emitting a radar pulse at specified intervals into a scene of interest; detecting magnitude of signal reflections from the scene of interest from the radar pulse; recording position data corresponding to individual radar pulse emissions and individual receptions of the signal reflections; creating the initial data set from the position data and detected magnitudes of the signal reflections; and applying the delay-and-sum (DAS) algorithm to the initial data set to create the DAS image data set.
 37. An apparatus for detecting objects in a scene of interest, the apparatus comprising: a vehicle; a plurality of radar transmission devices mounted on the vehicle and configured to transmit radar pulses into a scene of interest; a plurality of radar reception devices mounted on the vehicle configured to detect magnitude of signal reflections from the scene of interest from the radar pulses; a location determination device configured to detect location of the vehicle at times of transmission of the radar pulse from the plurality of radar transmission devices and reception of the signal reflections by the radar reception devices; a processing device configured to process an initial data set representing transmission site locations of individual ones of the radar pulses, reception site locations of reception of individual ones of the signal reflections, and number of data samples per reception profile by: applying a delay-and-sum (DAS) algorithm to the initial data set to create a DAS image data set, creating an estimated image value for each voxel in the image by iteratively deriving the estimated image value through application of a majorize-minimize principle to solve an l₁-regularized least-squares estimation problem that selects a sparse image derived from the DAS image data set.
 38. The apparatus of claim 37 wherein the processing device is further configured to iteratively derive an estimated image value for a given voxel using, for $l = 1, 2, \ldots, L$, $x_{l}^{(m+1)} = \frac{x_{DAS,l}}{1 + \frac{\lambda}{2|x_{l}^{(m)}|}}$ where $x_{l}^{(m+1)}$ is the estimated image value at an $l^{th}$ voxel for iterate number $(m+1)$, $x_{l}^{(m)}$ is the estimated image value at an $l^{th}$ voxel for iterate number $(m)$, $\lambda$ is a penalty parameter, and $x_{DAS,l}$ is a value of the $l^{th}$ voxel of the DAS image data set.
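
By way of illustration only, and not as part of any claim, the voxel-wise updates recited in claims 20, 35, and 38 might be sketched in Python as follows. The function names, the dense list-of-rows representation of the system matrix, the choice of the DAS image as the starting iterate, and the fixed iteration count are all assumptions made for this sketch. Each update is written in an algebraically equivalent form obtained by multiplying numerator and denominator by the magnitude of the current voxel value, so that a voxel value of zero needs no special handling. For a positive penalty parameter, the DAS-based iteration appears to approach a soft-thresholded copy of the DAS image with threshold λ/2, which the sketch uses as a sanity check.

```python
def mm_das_step(x_das, x, lam):
    """One pass of x_l <- x_DAS,l / (1 + lam / (2|x_l|)) over all voxels.

    Written as x_DAS,l * 2|x_l| / (2|x_l| + lam), which is the same update
    but stays well defined when x_l is exactly zero.
    """
    if lam <= 0:
        raise ValueError("lam must be positive in this sketch")
    return [d * 2.0 * abs(v) / (2.0 * abs(v) + lam) for d, v in zip(x_das, x)]


def mm_das_sparsify(x_das, lam, num_iters=200):
    """Repeat the DAS-based update a fixed number of times (assumed stopping rule)."""
    x = list(x_das)  # assumption: start from the DAS image itself
    for _ in range(num_iters):
        x = mm_das_step(x_das, x, lam)
    return x


def mm_l1_ls_step(A, y, x, lam):
    """One pass of x_l <- (H_l x_l + G_l) / (H_l + lam / (2|x_l|)).

    A is a dense K-by-L matrix given as a list of rows (illustration only;
    no hash-table or convolution shortcut is attempted here).
    """
    if lam <= 0:
        raise ValueError("lam must be positive in this sketch")
    K, L = len(A), len(x)
    # r_k: number of nonzero entries in row k of A.
    r = [sum(1 for a in row if a != 0.0) for row in A]
    # Residual y_k - [A x]_k for the current iterate.
    resid = [y[k] - sum(A[k][l] * x[l] for l in range(L)) for k in range(K)]
    x_new = []
    for l in range(L):
        H = sum(r[k] * A[k][l] ** 2 for k in range(K))
        G = sum(A[k][l] * resid[k] for k in range(K))
        mag = abs(x[l])
        # Same update, multiplied through by |x_l| to avoid dividing by zero.
        x_new.append((H * x[l] + G) * mag / (H * mag + lam / 2.0))
    return x_new


def soft_threshold(d, t):
    """Shrink d toward zero by t, clipping at zero."""
    return max(abs(d) - t, 0.0) * (1.0 if d > 0 else -1.0)


if __name__ == "__main__":
    das_image = [3.0, -2.0, 0.1, 0.0]  # made-up voxel values
    print(mm_das_sparsify(das_image, lam=1.0))
    print([soft_threshold(d, 0.5) for d in das_image])
```

When the system matrix in `mm_l1_ls_step` is the identity, every row has exactly one nonzero entry, so the general update collapses to the DAS-based update; this is one way to check the two formulas against each other.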